SECTION I
INTRODUCTION

A. OVERVIEW


The  advance  of  large-scale  integrated  circuit  (LSIC)  technology  has  spawned  many  new  technologies 
and made many others economically feasible. Military signal-processing functions found in such
applications as forward-looking infrared (FLIR), radar, guidance and control, and electronic
countermeasure (ECM) systems have been significantly impacted. In particular, image-processing
systems for video bandwidth reduction; FLIR automatic cueing; target detection, classification, and
tracking; and image understanding must apply LSIC technology to be affordable within system
constraints such as size, weight, power
dissipation,  cost,  and  reliability.  Integrated  circuits  developed  for  such  applications  are  most  affordable 
when  they  can  be  used  on  several  different  programs.  Such  robustness  can  be  accomplished  by  flexible, 
programmable  designs  or  by  functional  designs  that  efficiently  implement  commonly  used  functions  in  a 
manner  that  allows  them  to  be  parameterized  for  each  application. 

The  Programmable  Image  Processing  Element  (PIPE)  is  an  example  of  a  parameterized  functional 
design.  It  implements  the  sum-of-products  operator  common  to  many  image  processing  problems: 

y = \sum_{i=0}^{l-1} W_i X_i        (1)


where the term W_i represents a set of fixed, programmable weighting coefficients and X_i represents
a set (or l-point sequence) of input samples. More specifically, if l = 9, this equation can
implement the 3- by 3-pixel window operator used in image processing for high- and low-pass
filtering, edge enhancement, and edge crispening. This same function can be used to calculate the
coefficients of transforms such as Fourier, cosine, Hadamard, and Haar used in image and signal
processing. It is also applicable in other signal-processing areas such as recursive and
nonrecursive digital filtering.
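
The sum-of-products operator can be illustrated with a short software sketch. The following Python fragment is only an illustration of the arithmetic for l = 9, not a description of the PIPE hardware; the function names and the row-major ordering of the window are assumptions made for the example.

```python
def window_sum_of_products(weights, samples):
    """Return y = sum(W_i * X_i) over two equal-length sequences."""
    if len(weights) != len(samples):
        raise ValueError("weights and samples must have the same length")
    return sum(w * x for w, x in zip(weights, samples))


def filter_3x3(image, weights):
    """Apply a 3 x 3 window operator to the interior pixels of a 2-D list.

    The window is read in row-major order (an assumption of this sketch),
    so weights[0] multiplies the upper-left neighbor of each pixel.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * (cols - 2) for _ in range(rows - 2)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r - 1][c - 1] = window_sum_of_products(weights, window)
    return out


# Unnormalized low-pass (box) kernel: all nine weights equal to 1.
LOW_PASS = [1] * 9
```

Changing only the nine weights turns the same routine into a high-pass filter or an edge operator, which is the sense in which the function is parameterized for each application.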

Under Contract F33615-80-C-1180, the key to program success was the use of distributed
arithmetic techniques to implement the sum-of-products operator without digital multipliers,
an approach that could significantly impact image-processing systems.

B.  OBJECTIVE 

This contract represents the second phase of a previous contract (F33615-79-C-1763). The PIPE
Phase  I  contract  addressed  only  the  design  and  photomask  fabrication  of  the  PIPE  LSIC.  The  Phase  II 
emphasis  was  on  fabrication  and  testing  of  the  PIPE  LSIC  developed  under  Phase  I  and  the  design  and 
fabrication  of  a  brassboard  to  demonstrate  the  versatility  of  the  PIPE  LSIC.  There  were  three  major 
efforts: 

PIPE LSIC development, the objective of which was to fabricate the PIPE LSICs using the
photomasks  generated  in  Phase  I  and  perform  functional  evaluation.  If  any  device  errors 
were  discovered,  design,  photomask  generation,  processing,  and  testing  were  to  be 
repeated. 

Brassboard  demonstration  system,  with  which  to  demonstrate  the  ability  of  the  PIPE  LSIC  to 
implement  image-processing  algorithms  on  real-time  video  data. 

Brassboard demonstration at AFWAL to provide a complete exchange of technical information
regarding the PIPE LSICs and brassboard.

C.  SUMMARY 

Texas  Instruments  has  successfully  completed  all  of  the  objectives  of  the  PIPE  Phase  II  contract. 

1.  PIPE  LSIC  Development 

Figure  1  is  a  block  diagram  of  the  PIPE  LSIC.  The  input  data  A,  B,  and  C  are  parallel  words  that  can 
be  loaded  into  input  latches  either  serially  or  in  parallel.  Users  control  the  mode  of  operation  via  the 
parallel/serial  select  control  line,  allowing  the  PIPE  LSIC  to  operate  on  either  9  X  1  or  3  X  3  blocks  of 
data.  In  the  serial  mode,  all  data  are  loaded  through  the  C  input  pins  and  sequentially  clocked  into  the 
latches  in  nine  sample  periods.  In  the  parallel  mode,  data  are  loaded  through  the  A,  B,  and  C  input  pins 
into  three  separate  input  latches  and  then  sequentially  clocked  into  the  other  latches;  thus,  three  sample 
periods  are  required  to  load  all  the  input  latches. 

Bit-parallel  words  in  the  input  latches  are  converted  into  bit-serial  words  by  the  parallel-to-serial 
registers;  outputs  of  these  registers  form  the  9-bit  memory  address.  Shift-and-accumulate  operations  at 
the  memory  outputs  complete  the  sum  of  products.  Users  can  designate  either  signed  or  unsigned 
arithmetic  for  these  operations. 
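
The ROM-accumulate (distributed arithmetic) operation described above can be sketched in software. This is a minimal model under stated assumptions, not the PIPE circuit: the names build_rom and da_sum_of_products are hypothetical, the bits are processed least significant first, and the sign handling (subtracting the term addressed by the sign bits) is the textbook method for 2's complement data; the on-chip ordering and sign logic are not specified in this section.

```python
def build_rom(weights):
    """Precompute one partial sum per address.

    Bit i of the address selects weight i, so nine weights give 2**9 = 512
    entries, matching the 9-bit memory address described in the text.
    """
    count = len(weights)
    return [sum(w for i, w in enumerate(weights) if (addr >> i) & 1)
            for addr in range(1 << count)]


def da_sum_of_products(rom, samples, word_length=8, signed=False):
    """Compute sum(W_i * X_i) using only table lookups, shifts, and adds."""
    acc = 0
    for b in range(word_length):              # one bit-serial step per input bit
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> b) & 1) << i       # bit b of every input forms the address
        term = rom[addr] << b                 # weight the lookup by 2**b
        if signed and b == word_length - 1:
            acc -= term                       # sign bit carries weight -2**(n-1)
        else:
            acc += term
    return acc
```

No multiplication of a weight by a sample occurs anywhere in the loop; the multiplications are replaced by the precomputed table, which is why the memory contents must be reprogrammed whenever the weights change.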

Tristate  output  latches  are  provided  for  off-chip  buffering.  All  timing  and  control  for  the  parallel-to- 
serial  registers,  the  memory,  shift-and-accumulate,  and  the  output  buffers  are  generated  on  the  chip  using 
a  simple  shift-register  controller. 

Full 8-bit input and 20-bit output operations of the PIPE require 59 pins (Table 1). This can be
reduced to 40 pins if 6-bit input data and only 8 bits of the output are used. The pins required for
word-length selection, input type (parallel or serial), and data format (2's complement or
magnitude) can be eliminated by on-chip bonding to the V_DD or ground pin for fixed applications.

Figure 2 is a photomicrograph of the PIPE LSIC. The LSIC is implemented in N-channel metal
oxide semiconductor (NMOS) technology using conservative 5-µm design rules. An erasable,
programmable ROM is used to implement the memory function of the ROM-accumulate algorithm. Total bar
size is approximately 43 mm² (240 X 270 mils²).

IC processing required two passes because some minor layout errors were discovered in the
first-pass ICs (in the shift register controller, clock driver, and tristate output latches). These
errors prevented the ICs from being completely functional but did not prevent detailed evaluation
of the remaining circuits. The redesign, photomask generation, processing, and testing steps were
repeated.

Figure 2. Programmable Image Processing Element (PIPE)


TABLE 1. PIPE LSIC INPUT/OUTPUT PIN REQUIREMENTS

Control Line(s)                      No. Pins   Function
-----------------------------------  --------   ------------------------------------------------
Data-A input                             8      Input data
Data-B input                             8      Input data
Data-C input                             8      Input data
Parallel/serial select                   1      Determines mode of chip operation (3 X 3 or
                                                9 X 1 operations)
Input strobe                             1      Indicates valid input data and latches data into
                                                input latches
Word length (BCD code)                   3      Defines word length of input data
Master clock                             1      Square-wave clock provided for system timing
Load                                     1      Initiates parallel-to-serial data conversion
Data valid                               1      Output signal indicating complete calculation
Enable (EN)                              1      Used to tristate or enable output bus
Data too fast                            1      Inhibits input strobe during parallel load
                                                operations
Outputs                                 20      Output data
2's complement coefficients (TCC)        1      Used to set sign bit of output word
2's complement data (TCD)                1      Defines signed or unsigned magnitude data
                                                operation
V_DD                                     1      Single +5 V operating supply
V_PP                                     1      Normally at +5 V, but taken to +25 V for EPROM
                                                programming
GND                                      1      Substrate bias
Total                                   59

During the redesign phase, an automated schematic verification of the PIPE LSIC was performed.
This verification compared the actual LSIC layout with the circuit schematic and, with the exception
of some minor device size deviations in the TTL-to-MOS clock driver and the NOR buffers in the input
controller section, no layout errors were discovered. After the schematic verification, new
photomasks were generated (7 of the 12 photomasks were replaced) and second-pass PIPE LSICs were
processed and evaluated. The second-pass ICs were 100-percent functional.

Figure 3 shows a PIPE LSIC packaged in a 64-pin dual in-line package (DIP). The operating
characteristics of the PIPE LSIC are listed in Table 2.

TABLE 2. PIPE CHARACTERISTICS

Parameter                          Maximum           Minimum
-------------------------------    --------------    ----------------
Maximum operating voltage          10 V              4.5 V
Maximum programming voltage        35 V              17 V
Maximum strobe frequency           12 MHz            -
Maximum clock frequency*           12 MHz            (illegible) kHz
Typical EPROM access time          170 ns            -
Power requirements                 800 mW at 5 V     -
Memory erase time                  -                 40 min
(2537 Å at 15 W-s/cm²)

*Dependent on EPROM access time and mode of operation.


Figure 3. PIPE LSIC

2. Demonstration Brassboard

To demonstrate the PIPE LSIC, a flexible, completely self-contained brassboard (Figure 4)
operating in or near real-time was developed. It is 17.0 X 18.5 X 9.0 inches, weighs 46 pounds, and
dissipates 87 W. Although simple, the brassboard operates the PIPE LSIC in both its serial and
parallel modes and demonstrates its processing of vector/transforms and neighborhood operators. The
brassboard accepts single-line video as input and displays processed results on a standard TV
monitor for evaluation. It is constructed of 13 wire-wrapped boards, each capable of containing 50
to 60 integrated circuits in 16-pin DIPs; a one-to-one correspondence exists between these boards
and the blocks shown in Figure 4. The analog-to-digital converter (ADC) digitizes the incoming video
to 8 bits for further digital processing.

Images composed of 512 by 512 8-bit pixels are formed in a frame buffer and processed at frame
rates determined by the PIPE LSIC throughput, buffer memory speed, brassboard architecture, and the
algorithm being computed. Pixel processing rates for the PIPE LSIC are limited by the EPROM access
time.

Eight PIPE LSICs are used in the brassboard. For the 3 X 3 differential edge detectors, four PIPE
LSICs calculate the horizontal response while the other four PIPE LSICs calculate the vertical
response. The template-matching edge detectors also require eight PIPE LSICs for maximum
throughput, with one template assigned to each PIPE. The output of the PIPE LSICs is processed
according to the operation selected (e.g., calculate the magnitude and orientation for the
differential edge detectors or find the maximum response for template-matching edge detectors).

A minimum amount of interface between the user and the PIPE LSIC is required to communicate the
type of operations to be performed, the format of input data, etc. This interface is implemented
through a control panel for the demonstration brassboard. This control panel uses a 16-key
calculator-style keyboard for command entry, a 24-character alphanumeric LCD for displaying current
operating parameters and for prompting the user for new parameters, and a 3-digit thumbwheel switch
for entering threshold levels. The user commands the brassboard to perform various operations by
responding to seven prompts with seven single-keystroke replies. These prompts query the user for
operating parameters such as parallel or serial input, 2's complement data, 2's complement
weighting coefficients, word length, weighting arrangement (i.e., do all eight LSICs contain the
same weights?), postprocessing operation (magnitude, maximum, or no operation), and sliding or
nonsliding operation.

A display-refresh memory and digital-to-analog converter (DAC) provide analog data in a standard
television-monitor format. Both the ADC and DAC are high-speed (10-MHz), 8-bit, commercially
available components.

Eight boards implement the frame buffer and refresh memories: four identical boards implement
the frame buffer memory, and four identical boards implement the refresh memory. The buffer and
refresh memory boards are essentially identical, differing only slightly in the write and output
sections.
Both  memories  store  a  complete  video  frame  (512  X  512  X  8)  and  are  designed  with  16K  X  1  dynamic 
random-access  memories  (RAMs).  Each  board  holds  two  bit  planes,  i.e.,  512  X  512  X  2  bits.  The  frame 
buffer  memory  is  designed  to  provide  vertically  related  pixels  from  three  adjacent  lines  simultaneously  as 
inputs  to  the  PIPE  LSICs  for  parallel  mode  operation.  Data  are  available  also  in  serial  format  from  a 
single  port  for  serial-mode  operations.  Although  this  memory  captures  data  from  the  ADC  at  a  10-MHz 
rate  (100  ns/pixel),  a  special  demultiplexing  scheme  permits  use  of  memories  with  slower  access  times. 
The  refresh  memory  has  two  functions:  accept  data  from  the  postprocessing  electronics  and  provide 
digital  words  to  the  DAC  for  reproducing  analog  video.  These  functions  cannot  be  performed  at  the  same 
speed  but  are  synchronized.  Each  memory  cycle  is  designed  to  read  and  provide  time  for  a  possible  write 
if  data  are  available  from  the  postprocessing  electronics. 
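
The memory organization quoted above can be checked with simple arithmetic. The sketch below uses only figures stated in the text (512 X 512 X 8 frame, 16K X 1 RAMs, four boards per memory, 10-MHz sampling); the variable names are illustrative, and the frame capture time assumes back-to-back pixels with no blanking intervals, which is a simplification.

```python
PIXELS_PER_LINE = 512
LINES_PER_FRAME = 512
BITS_PER_PIXEL = 8
RAM_BITS = 16 * 1024            # one 16K x 1 dynamic RAM
BOARDS_PER_MEMORY = 4
ADC_RATE_HZ = 10_000_000

frame_bits = PIXELS_PER_LINE * LINES_PER_FRAME * BITS_PER_PIXEL
rams_per_memory = frame_bits // RAM_BITS
rams_per_board = rams_per_memory // BOARDS_PER_MEMORY
bit_planes_per_board = BITS_PER_PIXEL // BOARDS_PER_MEMORY

pixel_period_ns = 1_000_000_000 // ADC_RATE_HZ
# Capture time if every pixel follows the previous one without blanking (assumption).
frame_capture_ms = PIXELS_PER_LINE * LINES_PER_FRAME * pixel_period_ns / 1_000_000

print(rams_per_memory, rams_per_board, bit_planes_per_board)
print(pixel_period_ns, frame_capture_ms)
```

Under these assumptions each memory needs 128 RAMs, or 32 per board, which is consistent with two bit planes per board and fits within the 50 to 60 integrated circuits that each wire-wrapped board can hold.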

An LSIC input controller and LSIC output controller control the eight PIPE LSICs in the
demonstration brassboard. The input controller uses the input strobe pulse and a BCD value of the
number of input strobes between the parallel-to-serial conversions to generate a load pulse for the
PIPE LSIC shift-register controller. Corresponding inputs of the eight PIPE LSICs are connected. A
counter generates a load pulse to each LSIC after the correct number of input strobes. Thus, the
devices can be loaded in parallel or sequentially in various combinations. For differential edge
detectors, the first four PIPE LSICs calculate the horizontal response while the last four PIPE
LSICs calculate the vertical response; therefore, the first and fifth PIPE LSICs need the same data,
the second and sixth PIPE LSICs need the same data, etc. For calculating transform coefficients and
template-matching edge detectors, all eight PIPE LSICs must be loaded with the same data, i.e., have
the same load pulse.

The  outputs  of  the  eight  PIPE  LSICs  are  controlled  by  the  LSIC  output  controller  that  selects  any 
one, any pair, or all the outputs for postprocessing. The DATA VALID output of the PIPE LSIC
multiplexes the outputs of the eight PIPE LSICs in much the same way that the load pulse
demultiplexed the inputs; thus, their outputs can be loaded into latches either sequentially or in
pairs. For operations in
which  the  outputs  are  valid  simultaneously,  the  LSIC  output  controller  multiplexes  the  outputs  into  one 
latch. 

A  limited  amount  of  postprocessing  electronics  is  provided  on  the  demonstration  brassboard  to 
combine  outputs  of  the  PIPE  LSICs.  To  complete  the  magnitude  and  orientation  calculation  of  the 
differential  edge  operators,  the  ability  to  sum  the  absolute  values  and  calculate  a  3-bit  approximation  to 
the direction vector is provided. The ability to find the maximum of eight inputs is provided for
determining edge orientation using the template-matching edge detectors. A variable threshold
permits evaluation of the various edge detectors. Also, for some types of operations,
postprocessing is not required;
therefore,  an  option  to  bypass  the  postprocessing  function  is  provided. 
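
The postprocessing options described above can be sketched as follows. This is an illustrative model only: the function names are hypothetical, and the particular 3-bit direction coding (two sign bits plus a comparison of the absolute values) is an assumption, since the text states only that a 3-bit approximation to the direction vector is calculated.

```python
def edge_magnitude(h, v):
    """Approximate gradient magnitude as the sum of absolute values."""
    return abs(h) + abs(v)


def direction_code(h, v):
    """Hypothetical 3-bit direction code.

    bit 2 = vertical response negative, bit 1 = horizontal response
    negative, bit 0 = vertical response larger in absolute value.
    """
    return (int(v < 0) << 2) | (int(h < 0) << 1) | int(abs(v) > abs(h))


def best_template(responses):
    """Return (index, value) of the largest template-match response."""
    index = max(range(len(responses)), key=lambda i: responses[i])
    return index, responses[index]


def apply_threshold(value, level):
    """Pass a response only if it reaches the threshold level."""
    return value if value >= level else 0
```

For a differential detector the horizontal and vertical responses would feed edge_magnitude and direction_code; for a template-matching detector the eight LSIC outputs would feed best_template, whose index indicates the edge orientation.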

The  results  of  using  the  PIPE  demonstration  brassboard  to  implement  a  3  X  3  low-pass  filter,  a 
Sobel differential detector, and a 5-level template-match edge detector appear in Figure 5. The
technical aspects of the PIPE LSIC development, the demonstration brassboard system, and
experimental results are discussed in Section II, and a more detailed discussion of the PIPE LSIC
design and operational characteristics can be found in the Phase I final report.*

*T.F. Check, W.L. Eversole, and J.F. Salzman, "Programmable Image Processing Element," Final
Report, Contract No. F33615-79-C-1763 (November 1980).


SECTION  II 

TECHNICAL  DISCUSSION 


Texas  Instruments  has  fabricated  and  evaluated  a  programmable  image  processing  element  (PIPE) 
large-scale  integrated  circuit  (LSIC)  and,  to  demonstrate  it,  has  developed  a  flexible  brassboard  capable  of 
operating  in  real-time.  Details  of  the  PIPE  LSIC  development  and  the  design  of  the  demonstration 
brassboard  are  discussed  in  this  section. 

A.  PIPE  LSIC  DEVELOPMENT 

The  objective  of  the  PIPE  LSIC  development  was  to  fabricate  the  PIPE  LSICs  using  the  photomasks 
generated in Phase I (Contract F33615-79-C-1763) and perform functional evaluation. To correct any
device  errors  found  during  initial  testing,  time  in  the  Phase  II  schedule  was  included  for  redesign, 
photomask  generation,  and  testing. 

1.  Phase  II  First-Pass  Results 

During  Phase  II,  PIPE  LSICs  were  processed  using  photomasks  produced  during  Phase  I  PIPE 
development. The PIPE LSICs were processed in DMOS II (Dallas MOS Front End No. 2) using the
standard 25XX, 5-µm EPROM process. In January 1981, 20 slices were started through the process
flow; on 27 February 1981, 17 were finished. Three slices were damaged in the process flow.

The  PIPE  LSICs  were  probed  to  determine  the  quality  of  the  processing.  The  processing  parameters 
closely corresponded to the design models. Figure 6 shows the characteristics of a typical
enhancement and depletion transistor. The designed thresholds (V_T) were 0.8 V and -3.0 V,
respectively.

Circuit  probing  then  started  utilizing  a  low-capacitance  probe,  which  is  required  in  NMOS  circuit 
probing  owing  to  loading  effects  and  drive  limitations.  Figure  7  is  a  schematic  of  the  FET  probe  that  was 
used  on  the  various  LSIC  circuits. 

Initial  circuit  probing  indicated  circuit  errors.  Checking  the  Calcomp  plots  against  the  schematic  on 
the  suspected  areas  revealed  minor  layout  errors. 

Parallel  efforts  involving  slice-level  testing  were  temporarily  halted  after  the  initial  probe  testing 
showed  circuit  errors.  While  these  errors  prevented  the  LSIC  from  being  completely  functional,  they  did 
not  prevent  detailed  evaluation  of  the  remaining  circuits  in  the  redesign.  Table  3  lists  the  inoperative 
circuits  and  reasons  for  their  failure,  and  the  following  subsection  discusses  them  as  well  as  the  various 
redesign  modifications  in  more  detail. 

2.  Circuit  Evaluation  and  Redesign 

The  most  common  building  block  used  on  the  PIPE  LSIC  is  the  MOS  latch,  so  the  input  latch 
structure  was  the  first  circuit  evaluated. 

Figure  8  is  a  block  diagram  schematic  and  oscilloscope  photograph  of  a  functional  PIPE  LSIC  latch. 
The  input  is  TTL  level  (approximately  2.4  V),  and  the  latched  output  is  MOS  level  (approximately  5  V). 
Both the nine 8-bit input latches and the parallel-to-serial shift registers use this standard MOS latch.

Figure 6. First-Pass PIPE LSIC Transistor Characteristics


TABLE 3. FIRST-PASS PIPE LSIC CIRCUIT ERRORS AND CAUSES

Inoperative Circuits        Failure Cause
------------------------    -------------
Controller NOR buffers      Layout error
Clock driver                Layout error
Tristate output buffer      Layout error
Shift and accumulate        Design error
Controller                  Design error


Figure 9 shows the operation of a parallel-to-serial shift register. The 8-bit data on the left was
input to the serial port of the PIPE LSIC, strobed serially through the input latches, and
converted into bit-serial form for addressing the EPROM. This demonstrated the functionality of the
input section (input latches, multiplexers, and parallel-to-serial shift registers). However,
circuit evaluation revealed a possible design problem. Critical timing between the strobe and load
pulse indicated that, if a strobe occurred during the parallel load, data transfer into the
parallel-to-serial shift registers could be incorrect. To prevent a strobe during the data transfer
operation, a strobe inhibit circuit [referred to as the data too fast (DTF) circuit] was added. The
logic function implemented by the DTF circuit is shown in Figure 10. When the DTF control line is
low, the parallel load pulse is passed to NOR2 and is NORed with the STROBE pulse. When DTF is high,
the STROBE circuitry is unaffected by the load pulse.
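
The intent of the strobe inhibit function can be expressed as a small behavioral model. This sketch does not reproduce the gate-level circuit of Figure 10: the function name is hypothetical and all three signals are treated as active-high logic values, which is an assumption about polarity.

```python
def strobe_enabled(strobe, parallel_load, dtf):
    """Return 1 if the input strobe is allowed to latch data, else 0.

    With DTF low, an active parallel load pulse blocks the strobe.
    With DTF high, the load pulse has no effect on the strobe.
    """
    gated_load = parallel_load and not dtf   # load reaches the strobe gate only when DTF is low
    return int(bool(strobe and not gated_load))
```

In this model a strobe that arrives during the parallel load is simply suppressed while DTF is low, so the data being transferred into the parallel-to-serial shift registers cannot change during the transfer.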

The  next  section  to  be  evaluated,  the  PIPE  LSIC  512  X  512  bit  EPROM,  was  found  to  be  fully 
functional. 

Figure 8. NMOS Latch Circuit


Figure 10. PIPE LSIC Strobe Inhibit Circuit


Before programming, the EPROM was erased by exposing the IC to high-intensity ultraviolet light
(wavelength of 2537 Å). (After erasure, all bits are in a logic high state and logic lows are
programmed into the desired locations.) The EPROM was programmed by raising the programming
voltage, V_PP, to +25 V for 50 ms. (By raising V_PP, the address multiplexer is forced to select the
three MSBs from each of the three input words A, B, and C to form the 9-bit address to the EPROM.)
Data to be programmed was applied to the tristate buffer bond pads. The addresses and data were
changed and the programming process repeated several times.
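
The erase and program behavior described above can be modeled in a few lines. This is a sketch under stated assumptions: the class and function names are hypothetical, the word width is left as a parameter because it is not given here, and the order in which the MSBs of A, B, and C are packed into the 9-bit address is assumed for illustration.

```python
class EpromModel:
    """Behavioral model: erased cells read all ones; programming only clears bits."""

    def __init__(self, words, width):
        self.mask = (1 << width) - 1
        self.cells = [self.mask] * words          # erased state: every bit logic high

    def erase(self):
        """Model ultraviolet erasure by returning every bit to logic high."""
        self.cells = [self.mask] * len(self.cells)

    def program(self, address, data):
        """Program logic lows; a bit already low cannot be raised again."""
        self.cells[address] &= data & self.mask

    def read(self, address):
        return self.cells[address]


def programming_address(a, b, c, word_length=8):
    """Form a 9-bit address from the three MSBs of each input word.

    The packing order (A highest, C lowest) is an assumption of this sketch.
    """
    shift = word_length - 3

    def top3(x):
        return (x >> shift) & 0b111

    return (top3(a) << 6) | (top3(b) << 3) | top3(c)
```

The model makes the practical consequence explicit: a location programmed with the wrong data can be corrected only if the change clears additional bits; otherwise the device must be erased and reprogrammed.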

Data  was  read  from  the  EPROM  by  enabling  the  program  control  line  (logic  level  1)  via  a  special 
bond  wire  connection,  thus  providing  a  direct  path  for  the  memory  address  lines  at  all  times.  Data  was 
read from the internal 2 X 2 mil² test pads at the output of the memory preceding the
shift-and-accumulate circuitry. These pads were removed during redesign in an attempt to improve
memory
performance. 

Utilizing a special low-capacitance probe, access times were measured at two widely separated bit
locations of the EPROM, as shown in Figure 11. (Access times are dependent on bit location in the
EPROM because of capacitive loading of the address and bit select lines.) Figure 12 shows access
times measured at two locations. The bit location with the shortest address and bit select lines
had an access time of 100 ns, while the location farthest from the address buffers had the longest
address and bit select lines, resulting in an access time of 180 ns. Another important parameter
affecting access time is the threshold of the EPROM's floating gate storage transistors. Although
the measured thresholds were within the acceptable range for operation, they were slightly high,
resulting in slower access times.

Figure 11. PIPE 512 X 512 EPROM Memory Configuration


Figure 12. PIPE EPROM Access Times


Figure 13. Two-Bit Dynamic Adder Used in PIPE LSIC Shift-and-Accumulate Circuitry


The access time was somewhat longer than expected, and lower transistor thresholds were sought on
the second-pass processing. Another step to improve access time with minimal increase in power
consumption was to increase the size of the transistors in the memory drivers from the
parallel-to-serial shift register.

Next,  the  shift-and-accumulate  circuitry  was  evaluated.  The  full  adders  and  the  carry-sum  latches 
proved to be fully functional, but a design error was found in the dynamic adders. Figure 13 shows
the 2-bit dynamic adder.

During shift-and-accumulate operation, a latch accumulate (LATCH CC) pulse precharges the two
capacitive nodes of the adder to a logic high through their respective precharge transistors.
However, because of transistor leakage, the two nodes discharged slightly when the LATCH CC pulse
went low. A solution
was  easily  derived:  the  drains  of  the  precharge  transistors  were  connected  to  the  VDD  rail  instead  of  the 
LATCH  CC  line;  in  this  configuration,  less  current  is  supplied  by  the  LATCH  CC  pulse  and  any  leakage, 
as  long  as  it  is  small,  enhances  the  precharge  node  voltage.  The  new  design,  which  proved  itself  in  the 
second-pass  results,  is  illustrated  in  Figure  14. 

Figure 14. Two-Bit Dynamic Adder Design Change

Next examined in detail was the shift register controller. Two errors were discovered. First,
the TTL clock buffer and the NOR gate buffers had nodes that were incorrectly connected. Second,
computer simulation showed that, under certain power-up conditions, the output of a shift-register
latch (SR) could be a logic 1 while the outputs of the data latches (DL) are each a logic 0. This
places a logic 1 on the clear input of a data latch, producing a logic 0 on its output. The LOAD
input line would become inoperative because of the logic 1 on the clear input of that latch, and
the shift register controller would be in a stable, yet invalid, operating mode. To alleviate this
potential problem, a power-on reset function was designed into the controller. Figure 15 is a
partial diagram of the controller with the addition of the power-on reset function.

Next, layout errors in the tristate output buffer, NOR bus driver, and clock driver were corrected
and a complete detailed computer simulation made on each section. It was discovered that not all
stray capacitance was used in the initial simulations during Phase I and that this stray capacitance
degraded the overall speed performance, although a 10-MHz input data rate was still achievable.
Design errors were corrected and computer simulation indicated operational circuits.

As indicated in Table 4, 7 of the original 12 masks were changed to correct defects.

TABLE 4. PIPE MASK LEVELS

Mask No.    Level Name              Second-Pass Mask Changes
--------    --------------------    ------------------------
817-256     Inverse moat            *
817-257     Metal                   *
817-258     Oxide removal
817-259     N implant
817-260     P' implant
817-261     Depletion implant       *
817-262     Natural implant
817-263     Contact 2, Coat 1       *
817-264     Contact 2, Coat 2       *
817-265     Polysilicon, Level 1    *
817-266     Polysilicon, Level 2
817-267     Contact 2, Coat 3       *

3. Technology Used in Redesign

The front-end loading of DMOS II caused a slip in schedule for reprocessing the PIPE LSIC but also
provided the time necessary to use a new design aid. The new tool is computer software that
replaces visual and manual plot checking. The program consists of two parts. The first routine
checks the actual layout against predefined layout rules. For example, a metal line contacting a
polysilicon line must meet certain guidelines. Figure 16 shows a simple example for the 5-µm
technology used on the PIPE LSIC; this is a jumper where a signal is passed over another line. This
requires a minimum of six rule checks on spacing and overlap, with each rule set defined by the
particular technology used.
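
The kind of spacing and overlap check performed by the first routine can be sketched as follows. This is not the software used on the program: the Rect type, the function names, and the rule values in the usage note are hypothetical, and only axis-aligned rectangles are handled.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rect:
    layer: str
    x0: float
    y0: float
    x1: float
    y1: float


def spacing(a, b):
    """Edge-to-edge distance between two rectangles (0 if they touch or overlap)."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def enclosure(outer, inner):
    """Smallest margin by which outer extends past inner (negative if it does not)."""
    return min(inner.x0 - outer.x0, inner.y0 - outer.y0,
               outer.x1 - inner.x1, outer.y1 - inner.y1)


def spacing_violations(rects, min_spacing):
    """List same-layer pairs that are separate but closer than the layer's rule."""
    violations = []
    for a, b in combinations(rects, 2):
        if a.layer != b.layer:
            continue
        gap = spacing(a, b)
        if 0.0 < gap < min_spacing[a.layer]:
            violations.append((a, b, gap))
    return violations
```

A jumper such as the one in Figure 16 would be checked by calling spacing_violations for each layer and enclosure for each contact against its covering metal and polysilicon, with the minimum values taken from the rule set for the technology in use (for example, a hypothetical min_spacing of {"metal": 5.0}).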

The second part of the verification involves the actual "circuit." Each circuit is described by its
transistor and node connection makeup. This is referred to as the HDL (hardware description
language) description. Figure 17 shows an example of the TTL clock driver circuit and its HDL
description.

A description similar to that shown in Figure 17 is then generated by software using the actual
layout  data.  The  two  HDL  descriptions  are  then  compared  for  a  match.  If  an  error  has  occurred,  a  conflict 
message  is  generated  and  the  user  is  notified  of  the  error  via  computer  printout.  Figure  18  is  an  example 
of  the  verification  routine  and  software  flow. 
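
The comparison step can be sketched in simplified form. The parser below accepts device lines in the style of Figure 17 (for example, "M01 : NP C4, CLKI, GND, BULK, 10, 0.2;"); treating the last two fields as device sizes is an assumption, as are the function names. The sketch also assumes that node names agree between the two descriptions and that no two devices share the same type and nodes; a production tool would need graph matching to remove those assumptions.

```python
def parse_devices(text):
    """Map (device type, node tuple) to a (size, size) tuple for each device line."""
    devices = {}
    for line in text.splitlines():
        line = line.strip().rstrip(";")
        if ":" not in line:
            continue                      # skip block headers, comments, port lines
        _name, body = line.split(":", 1)
        fields = body.replace(",", " ").split()
        if len(fields) != 7:
            continue                      # not a four-node device with two sizes
        dev_type, nodes = fields[0], tuple(fields[1:5])
        devices[(dev_type, nodes)] = tuple(float(v) for v in fields[5:7])
    return devices


def compare_netlists(schematic, layout):
    """Return a list of conflict messages; an empty list means the two match."""
    conflicts = []
    for key in sorted(schematic.keys() - layout.keys()):
        conflicts.append(("missing in layout", key))
    for key in sorted(layout.keys() - schematic.keys()):
        conflicts.append(("extra in layout", key))
    for key in sorted(schematic.keys() & layout.keys()):
        if schematic[key] != layout[key]:
            conflicts.append(("size mismatch", key))
    return conflicts
```

Device names are deliberately ignored in the comparison, because the names assigned during layout extraction need not match those drawn on the schematic; only connectivity and size are compared.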

In  defining  the  HDL  description,  a  complete  block  description,  along  with  signal  names,  must  be 
generated.  Figure  19  illustrates  an  HDL  block-level  description  of  the  PIPE  LSIC. 

The layout schematic verification shown in Figure 18 was used on the PIPE final layout. There were
no catastrophic layout errors. Approximately 5,000 transistors were processed in the HDL software
routine. A transistor size error in the clock driver was discovered during schematic verification,
but computer analysis showed no significant performance degradation. No other errors were found.

Figure 16. Layout Rule Check

In anticipation of receiving processed parts, the slice-level test effort was restarted using the
Texas Instruments advanced components tester (ACT). The first test involved continuity and power
supply checks. Next, the controller was checked for data valid output involving clock driver and
controller functionality. Finally, the EPROM was tested: all bits were checked for an erased state,
random pattern verification made, and all bits programmed to a low. Devices passing the slice-level
probe were scribed, broken, and bonded into 64-pin dual in-line packages. Detailed evaluation was
performed using all I/O pads on the PIPE LSIC.

4.  Second-Pass  Results 

Second-pass  processing  of  the  PIPE  LSICs  was  completed  on  28  January  1982. 

Before application of the protective overcoat, two slices of LSICs were obtained for preliminary
evaluation. Figure 20 shows the transistor characteristics of an enhancement and depletion mode
transistor from the preliminary slice evaluation. The measured thresholds were +0.85 V and −2.7 V; the


BLOCK CLOCK;
(* PIPE? CLKI CLOCK DRIVER *)
CLKI @ INPUT;
(CLKB, CLK) @ OUTPUT;
STRUCTURE
M01 : NP C4, CLKI, GND, BULK, 10, 0.2;
M02 : ND VDD, C4, C4, BULK, 1.6, 0.4;
M03 : NP C5, CLKI, GND, BULK, 16, 0.2;
M05 : ND VDD, C4, C5, BULK, 8, 0.4;
M06 : NP CLKB, CLKI, GND, BULK, 120, 0.2;
M07 : ND VDD, C5, CLKB, BULK, 20, 0.3;
M08 : NP C8, C5, GND, BULK, 2, 0.2;
M09 : ND VDD, C8, C8, BULK, 0.8, 0.4;
M10 : NP C9, C5, GND, BULK, 4, 0.2;
M11 : ND VDD, C8, C9, BULK, 2, 0.4;
M12 : NP CLK, C5, GND, BULK, 50, 0.2;
M13 : ND VDD, C9, CLK, BULK, 20, 0.3;
END CLOCK;

BLOCK NP GENERIC;


Figure 17. Clock Block Schematic (Left) and Description (Right)

Figure 18. Verification Software Flow

Figure 20. PIPE LSIC Transistor Characteristics
design values were 0.80 V and 3.0 V, respectively. During the second pass, then, the thresholds were
slightly closer to the design values than during the first pass. Figure 21 is a microphoto of the redesigned
PIPE LSIC.

Each section was evaluated starting with circuits in which there had been design errors. First, the
TTL-to-MOS clock driver was investigated. Figure 22 shows actual operation of the clock driver at
10 MHz. The waveforms appear somewhat degraded because of test setup and probe parasitics.

The next section evaluated was the shift register controller. Operation at 10 MHz is shown in Figure
23. The controller generates the parallel load, clear accumulate, latch accumulate, and data valid signals.
The clock is the on-chip clock and the load is external to the chip. The power reset and the many NOR
buffers used in the circuit were functional. A small problem was found with the data valid buffer: series
resistance in the output line prevented a full TTL output level. However, using a pullup resistor or
bonding closer to the driver improved the integrity of the pulse.

Next, operation of the tristate output latch was checked and found fully functional. Figure 24 shows
an enabled tristate output with the input low and the tristate pulse as input. Note that the tristate output
buffer is of the inverting type.

The shift and accumulate section was then checked. With the output of the PIPE memory erased to all
digital 1's and the controller set for 8-bit operation, the controller was run and the shift and accumulate
output checked. The easily calculated result agreed with the output of the shift and accumulate and
proved that the design errors had been successfully corrected.
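
The "easily calculated result" can be reproduced with a short sketch. It assumes, beyond what the text states, that an erased EPROM word reads as the unsigned 12-bit value 0xFFF, that unsigned operation is selected, and that the accumulator doubles its running total before adding each new memory word. The function name is illustrative only.

```python
ROM_WORD_BITS = 12   # EPROM word width (the memory is organized as 512 x 12)
WORD_LENGTH = 8      # controller set for 8-bit operation

def shift_accumulate(rom_words):
    """MSB-first shift and accumulate: acc = 2 * acc + word on each cycle."""
    acc = 0
    for word in rom_words:
        acc = (acc << 1) + word
    return acc

# Assumption: an erased memory returns all 1's (0xFFF) at every address,
# so the address sequence does not matter for this check.
erased_word = (1 << ROM_WORD_BITS) - 1
expected = shift_accumulate([erased_word] * WORD_LENGTH)
print(expected, hex(expected))   # 1044225 0xfef01
```

Under these assumptions the result is 4095 × 255, which still fits within the 20-bit output word.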

Next, the input section, which had proved operational on first pass, was evaluated in detail. Figure
25 shows its operation at a 10-MHz clock rate. There are nine input latches with multiplexers to allow
either parallel or serial input data and nine parallel-to-serial shift registers to convert the bit-parallel
words into bit-serial format for addressing the memory. By operating the PIPE LSIC in the serial mode
and strobing the input latches nine times, the 8-bit data at the serial port (10101100 for this case) is
clocked through all the input latches and multiplexers. When the load pulse goes high, the bit-parallel
word is converted into a bit-serial word beginning with the MSB.
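
The parallel-to-serial conversion can be sketched as follows. The ordering of the nine inputs within the 9-bit memory address (first word in the most significant address bit) is an assumption made for illustration, as are the function names.

```python
def to_bit_serial(word, word_length=8):
    """Return the bits of `word`, most significant bit first."""
    return [(word >> shift) & 1 for shift in range(word_length - 1, -1, -1)]

def memory_addresses(words, word_length=8):
    """One 9-bit address per clock cycle, built from one bit of each input word."""
    streams = [to_bit_serial(w, word_length) for w in words]
    addresses = []
    for cycle in range(word_length):
        addr = 0
        for stream in streams:   # assumed order: first word -> address MSB
            addr = (addr << 1) | stream[cycle]
        addresses.append(addr)
    return addresses

pattern = 0b10101100             # test pattern quoted in the text
serial_bits = to_bit_serial(pattern)
addresses = memory_addresses([pattern] * 9)
print(serial_bits)               # [1, 0, 1, 0, 1, 1, 0, 0]
print(addresses)                 # [511, 0, 511, 0, 511, 511, 0, 0]
```

With the same pattern in all nine latches, every address bit is identical on each cycle, so the memory sees only addresses 0 and 511.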

The last circuit to be evaluated was the PIPE EPROM. The memory, which was functional on the
first pass and thus did not require redesign, was checked for access times, which are important to the
overall operational speed of the PIPE LSIC. Worst-case measurements were made, and the results are
indicated in Figure 26, which shows an address pulse with associated memory output. In addressing a
high bit, the memory responded in 50 ns; when addressing a low bit, however, the time increased to
170 ns. This window is dependent on the memory cell (FAMOS) threshold. Ideally, the high and low
access times should be identical, which would yield an overall access time of 110 ns.

Final testing of the PIPE LSIC was as a whole unit. All circuit blocks functioned together, and all
timing proved correct. Erasure and reprogramming of the EPROM was confirmed. An operating
frequency slightly above 10 MHz was demonstrated by all sections except the EPROM, which was slower
than expected.

Figure 26. Memory Access Times


The LSICs demonstrated power-supply sensitivity after initial operation. Operation was enhanced
by running the VDD supply at 5.5 V. This voltage sensitivity problem exists in the EPROM access times.
Data appear at the output of the EPROM at slightly different times because of the bit path lengths.
Operating at a higher voltage improves overall memory access time and eliminates missing bits by
effectively changing the threshold-level detection point of the sense amplifier.

5.  Processing  and  Packaging 

Processing was begun on 12 December 1981 on 15 slices, 12 of which were successfully finished on
28 January 1982. Probe yield was good, averaging around 50 percent. Overall packaged device yield
showed a 10-percent loss, resulting in a 40-percent overall slice-to-package yield. This high yield was the
result of the NMOS 5-µm process, which by today's standards is old: 3-µm processes are in production
today, and 2-µm and 1-µm processes are in preproduction development.

A 64-pin hermetic package with a UV lid for memory erasure was used for the PIPE LSICs. Figure
27 is a picture of the final packaged part. The PIPE LSIC measures 240 × 2'T) mil and contains more
than 11,000 MOS transistors.

6.  Summary 

The objective of the PIPE program was the design and development of a general-purpose,
programmable, digital integrated circuit capable of multiple signal-processing functions.

Over the last 20 years, the semiconductor industry has progressed steadily in its effort to achieve
greater capability from semiconductor and other device technologies at lower costs. Products of this effort
include increased functional densities (more capability in smaller volume), improved performance/power
ratios, higher processing throughput rates, improved reliability, and many more. In 1978, when the PIPE
LSIC development began, 5-µm technology was the state of the art in production technology. Since then,
3-µm technology has moved to the production lines and 1-µm technology is being developed. Using the
technology that is available today, the overall operating speed of the PIPE LSIC could be significantly
enhanced.

The PIPE LSIC development demonstrates the feasibility of using more than 11,000 MOS transistors
in implementing a ROM accumulate (RAC) processor on a single LSIC. It provides a user-oriented
vehicle for testing various image-processing algorithms. The present 5-µm EPROM version of the PIPE
LSIC operates at a maximum clock rate of 5 MHz, which is owing totally to EPROM access times. Using
2- to 3-µm technology and replacing the EPROM with ROM could result in a PIPE LSIC capable of
20-MHz clock rates. Table 5 summarizes chip performance and power requirements. Table 6 is a pinout
assignment list for the PIPE LSIC (data inputs A, B, and C carry separate designators there to avoid
confusion with the EPROM programming address), and Table 7 lists the controls.

B. PIPE EPROM PROGRAMMER

To program and verify the PIPE LSIC EPROM, a programmer capable of interfacing to a Digital
Equipment Corporation PDP-11 computer via a DR11-C interface or a Texas Instruments 990/10
computer through a 16-bit CRU I/O interface was designed and fabricated. In the simplified block
diagram of
Parameter                          Maximum          Minimum

Maximum operating voltage          10 V             4.5 V
Maximum programming voltage        35 V             17 V
Maximum strobe frequency           12 MHz           -
Maximum clock frequency*           12 MHz           1 kHz
Typical EPROM access time          170 ns           -
Power requirements                 800 mW at 5 V    -

Memory erase time (2537 Å at 15 W-s/cm²): 40 minutes

*Dependent on EPROM access time and mode of operation.


bit 0 (CSR0) and bit 1 (CSR1), respectively, of the control and status register. These bits can be loaded or
read from the UNIBUS. The control signal NEW DATA is a 400-ns-wide positive pulse generated by the
DR11-C interface when information is loaded into the OUTPUT BUFFER REGISTER from the
UNIBUS. If a 16-bit CRU I/O interface is used, the user must generate the control signals NEW DATA,
CS1, and CS2.

The programmer has two modes of operation, program and verify, that are determined by the
state of the control signal CS2. Since the PIPE EPROM is organized as 512 × 12, programming requires
12 bits of data and 9 bits of address. The controller uses some of the 16 input data lines (OD0-OD15),
some of the 16 output data lines (ID0-ID15), and some of the PIPE LSIC outputs during programming
and verification.

To program a word into the PIPE LSIC EPROM, address, data, and programming voltage are
required. The program mode is defined by a logical high state on control signal CS2; special power-on
circuitry is included in the control logic to force the output of the CS2 buffer to a low state, thus
preventing accidental programming of an EPROM location when power is initially applied.

To load the 9-bit EPROM address into the address latches, the address word is output from the host
computer on lines OD0-OD8. The MSB of the 16-bit output lines, OD15, is also set high and used by the
control logic in conjunction with the NEW DATA control signal to clock the address into the address
latch. The address is latched into the address latches on the high-to-low transition of the NEW DATA
control signal. CS2 is used to enable the address latch. During the program mode, the address latch is
constantly enabled.

After the address has been latched, the 12-bit EPROM data is latched into the data latch. This is
accomplished by setting data line OD15 to a logic low level, outputting the 12-bit data on data lines
OD0-OD11, and pulsing the NEW DATA control line. Again, the data is latched into the data latches on
the high-to-low transition of the NEW DATA control line. Control signal CS2 is also used to enable the
data latch; therefore, during the program mode, the data latch is constantly enabled.
TABLE 7. CONTROL DEFINITION

Control Line(s)                  Function

Word length (3-bit BCD code)     Defines word length in bits of input data*
Parallel/serial                  Determines mode of chip operation: 3 × 3 or 9 × 1
Master clock                     A square-wave clock provided for system timing
Load                             Initiates parallel-to-serial data conversion
Input strobe                     Indicates valid input data and latches it in input latches
2's complement data              Defines signed or unsigned magnitude data operation
Enable                           Tristates or enables output bus
VDD                              Single +5-V operating supply
VPP                              Normally at 5 V but taken to 25 V for EPROM programming
Data valid                       Output signal indicating complete computation
2's complement coefficients      Sets sign bits of output word
DTF                              Inhibits input strobe during parallel load operation

*Word length of 1 used for memory verification.
Figure 29. PIPE EPROM Programmer Program Mode


To program an EPROM location defined by the 9-bit address with the data in the data latches, the
programming voltage VPP is taken from its normal 5-V level to the programming level of 25 V for a
minimum of 50 ms. When control signal CS1 is taken high, VPP is taken to 25 V, causing a multiplexer
internal to the PIPE LSIC to select the 12 LSBs of the PIPE's 20-bit output as EPROM data inputs and
the 3 MSBs of each of the three PIPE inputs as EPROM addresses. After 50 ms, CS1 is taken low and the
data word has been programmed. The next address and data can now be latched and programmed. Figure
29 summarizes the EPROM programmer's mode of operation. After all 512 EPROM words are
programmed, CS2 is taken low, thus disabling the address and data latches. The PIPE LSIC inputs and
outputs can then be used to read the contents of the EPROM to verify correct programming.
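
The program-mode sequence for one EPROM location can be sketched from the host's side. This is only an illustration of the word formats described above: the step names (`output`, `pulse`, `cs1`, `wait_ms`) and the function `program_steps` are hypothetical and do not correspond to any actual DR11-C or CRU driver interface. CS2 is assumed to be held high throughout.

```python
ADDRESS_FLAG = 1 << 15     # OD15 high marks an address word, low marks a data word
PROGRAM_PULSE_MS = 50      # minimum time VPP is held at the programming level
EPROM_WORDS = 512

def program_steps(address, data):
    """Return the ordered host actions that program one 12-bit word."""
    if not 0 <= address < EPROM_WORDS:
        raise ValueError("address must fit in 9 bits (OD0-OD8)")
    if not 0 <= data < (1 << 12):
        raise ValueError("data must fit in 12 bits (OD0-OD11)")
    return [
        ("output", ADDRESS_FLAG | address),  # OD15 = 1, OD0-OD8 = address
        ("pulse", "NEW DATA"),               # latched on the high-to-low edge
        ("output", data),                    # OD15 = 0, OD0-OD11 = data
        ("pulse", "NEW DATA"),
        ("cs1", 1),                          # VPP goes to the programming level
        ("wait_ms", PROGRAM_PULSE_MS),
        ("cs1", 0),                          # word is now programmed
    ]

steps = program_steps(0x1A5, 0xABC)
minimum_total_ms = EPROM_WORDS * PROGRAM_PULSE_MS
print(hex(steps[0][1]), hex(steps[2][1]), minimum_total_ms)   # 0x81a5 0xabc 25600
```

At 50 ms per word, programming all 512 locations takes at least 25.6 seconds, not counting host overhead.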


Reading the contents of the PIPE EPROM is much more difficult than programming because
verification requires dynamic operation of the PIPE LSIC. When operating the PIPE LSIC in the serial
mode with 1-bit input data, an EPROM address can be strobed into the input latches of the PIPE LSIC;
after three clock cycles, the 12 LSBs of the PIPE output represent the data at the EPROM address defined
by the input data.

When control signal CS2 is low, the PIPE programmer is in the verify mode of operation. A
parallel-in serial-out shift register within the control logic hardware is loaded with the 9-bit address on
lines OD0-OD8 when data line OD9 is high and control line NEW DATA is pulsed. Data line OD9 is then
set to a low level, putting the shift register in a shift mode. The first of nine 1-bit words is then available
at the PIPE LSIC serial input and is strobed into the PIPE input latch by pulsing control signal CS1, i.e.,
taking CS1 high and then low. (CS1 controls the PIPE LSIC strobe line.) The next 1-bit word is shifted
out of the shift register and strobed into the PIPE LSIC by pulsing NEW DATA, then pulsing CS1. This
procedure is repeated until all nine 1-bit words of the desired address have been strobed into the PIPE
input latches. The PIPE LSIC is issued a load pulse and clocked by the control logic. Assuming that the
programmer's clock is faster than the host computer, the latter can issue a read instruction, and the
output of the PIPE LSIC can be read. An enable generator internal to the control logic enables the PIPE's
tristate outputs. Data line OD10 is taken high, and the PIPE denoted by data lines OD11-OD13 (i.e.,
binary equivalent for the PIPE No. 0-7) is enabled for the host computer to verify. If the operator only
wants to verify one PIPE LSIC, input data line ID13 can be set low through a front-panel switch and only
PIPE No. 0 will be enabled. If eight PIPE LSICs are to be verified, the panel switch is set high (ID13 is
high), and the enable generator will sequentially enable the PIPE LSICs, beginning with device 0.

After the PIPE device(s) has been verified, the next address is loaded in the control logic shift
register and the process repeated. Figure 30 summarizes the verification mode. After all 512 EPROM
locations have been verified, all data and control signals are taken low.

C. DEMONSTRATION BRASSBOARD

1. Overview

To demonstrate the PIPE LSIC's versatility, a flexible brassboard operating in or near real time was
developed. The design goals of the brassboard were as follows:

• Feature standard RS-170 composite video input and output

• Provide 10-MHz data rate (512 × 512 display)

• Provide serial or three parallel (vertically sequential) words

• Provide control signals (clock, strobe, load, etc.) to PIPE LSIC

• Provide magnitude, maximum, threshold, and no-op postprocessing functions

• Demonstrate various operations

    • Differential edge detector (Prewitt, Sobel, etc.)

    • Template-matching edge detector (Compass, Kirsch, etc.)

    • Low-pass filter

• Provide flexibility to allow future hardware demonstration.

The most important goal was the flexibility to demonstrate future hardware developments.

The demonstration brassboard (Figure 31) is completely self-contained, accepts single-line video as
input, and displays processed results on a standard TV monitor for evaluation. The brassboard will
operate the PIPE LSIC in both its serial and parallel modes and demonstrate its processing of
vector transforms and neighborhood operators. The analog video input is quantized to 8-bit precision
and stored in the frame buffer memory, which forms the frame images from the time-interlaced video
fields. Images composed of 512 by 512 8-bit pixels can be processed at frame rates determined by the
PIPE LSIC throughput, buffer memory speed, brassboard architecture, and the algorithm being
computed. Pixel processing rates for the PIPE LSIC are limited by the EPROM access time.

Eight PIPE LSICs are used in the brassboard. For the 3 × 3 differential edge detectors, four PIPE
LSICs calculate the horizontal response while the other four are calculating the vertical response. The
template-matching edge detectors also require eight PIPE LSICs for maximum throughput, with one
template assigned to each PIPE. The output of the PIPE LSICs is processed according to the operation
selected (e.g., calculating the magnitude and orientation for the differential edge detectors or finding the
maximum response for template-matching edge detectors). A minimum amount of interface between the
user and the PIPE LSIC is required to communicate the type of operation to be performed, the format of
input data, etc. This interface is implemented through a control panel for the demonstration brassboard.
This control panel uses a 16-key calculator-style keyboard for command entry, a 24-character
alphanumeric LCD display for displaying current operating parameters and prompting the user for new
parameters, and a 3-digit thumbwheel switch for entering threshold levels. The user commands the
brassboard to perform various operations by responding to seven prompts with seven single-keystroke
replies. These prompts query the user for operating parameters such as parallel or serial input, 2's
complement data, 2's complement weighting coefficients, word length, weighting arrangement (i.e., do
all eight LSICs contain the same weights?), postprocessing operation (magnitude, maximum, or no
operation), and sliding or nonsliding operation.

A display-refresh memory and digital-to-analog converter provide analog data in a standard
TV-monitor format. Both the ADC and DAC are high-speed (10-MHz), 8-bit, commercially available
components.

The  brassboard  is  constructed  from  off-the-shelf  components  and  standard  wire-wrap  techniques. 
The  brassboard  is  functionally  partitioned  into  13  wire-wrapped  boards,  each  measuring  7.3  X  7.0  in. 
and  capable  of  containing  up  to  60  integrated  circuits  in  16-pin  dual  in-line  packages.  There  is  a  1:1 
correspondence  between  the  boards  and  the  block  diagram  shown  in  Figure  31. 
Figure 33. Analog Module of PIPE Demonstration Brassboard


The DAC on the brassboard's analog board is actually a combination of DAC and composite sync
generator. This unit, an Analogic MP8308, accepts 8-bit digital data from the refresh memories and
composite blanking (COMP BLNK) and composite sync (COMP SYNC) from the timing board and
produces standard RS-170 composite video with 256 gray shades that will directly drive the composite
video input of a television monitor.

The  use  of  a  commercially  available  ADC  and  DAC  sync  generator  greatly  simplified  the  design  of 
these  functions. 

3.  Buffer  Memory 

The buffer memory stores a complete video frame of two interlaced fields, each containing 256 lines
with 512 pixels/line. The buffer memory is designed to capture data from the ADC at a 100-ns pixel rate
and to provide vertically related pixels from three adjacent lines simultaneously as inputs to the PIPE
LSICs. Data from memory is available also in serial format for demonstrating the PIPE LSIC's serial
mode of operation.

As shown in the simple block diagram, Figure 34, the memory is partitioned into odd-field odd
line, odd-field even line, even-field odd line, and even-field even lines. The memory design is based on
the Intel 2118, a 16k × 1-bit dynamic RAM (DRAM), which requires a single +5-V power supply and
dissipates 150 mW operating and 11 mW standby. The typical read/write cycle time of those DRAMs is
270 ns, but a special demultiplexing/multiplexing scheme permits using memories with access times of
less than 100 ns. The buffer memory requires 128 (each 16k × 1) DRAMs to store an image of 512 ×
512 × 8 bits. The addition of data registers, address generators, and multiplexers to provide additional
versatility increased the number of components for the buffer memory to 240. The wire-wrap boards
selected for fabrication of the brassboard accommodate 60 components; therefore, four boards (Figure
35), each representing 512 × 512 × 2 bits of memory, are required.
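
The component counts quoted above follow from simple arithmetic, sketched here. The function names are illustrative; the figures (16k × 1 DRAMs, 240 components, 60 components per board) are those given in the text.

```python
DRAM_BITS = 16 * 1024          # one 16k x 1 DRAM
COMPONENTS_PER_BOARD = 60      # capacity of one wire-wrap board

def drams_needed(width, height, bits_per_pixel, dram_bits=DRAM_BITS):
    """Number of DRAMs required to hold one image (rounded up)."""
    total_bits = width * height * bits_per_pixel
    return -(-total_bits // dram_bits)      # ceiling division

def boards_needed(components, per_board=COMPONENTS_PER_BOARD):
    """Number of wire-wrap boards required for a given component count."""
    return -(-components // per_board)

frame_drams = drams_needed(512, 512, 8)     # whole 8-bit frame
board_drams = drams_needed(512, 512, 2)     # one 2-bit-deep board
boards = boards_needed(240)
print(frame_drams, board_drams, boards)     # 128 32 4
```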

Figure 36 shows more detail of one buffer memory board (512 × 512 × 2 bits). Data received by
the memory boards is input to a 4-bit serial-in parallel-out (SIPO) shift register to reduce the required
10-MHz data rate to a 2.5-MHz or 400-ns memory cycle time. When the shift register is full, all four
words are written to memory simultaneously. Control lines Y0 and ODDFLD determine the bank [i.e.,
odd-field odd line (OFOL), even-field odd line (EFOL), odd-field even line (OFEL), or even-field even
line (EFEL)] in which the data is written. The seven address lines and control lines C1, C2, C3, and C4
determine the address within a given bank.

Control signal OUTLD loads 4 bits of memory data into a parallel-in serial-out (PISO) shift register
to multiplex the memory outputs. This is the inverse of the demultiplexing of the input data and restores
the data rate to 10 MHz.
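
The demultiplexing and multiplexing steps can be modeled as grouping a sample stream four at a time and flattening it again. This is a behavioral sketch only; the function names are illustrative and the samples stand for whatever bits a given memory board handles.

```python
PIXEL_PERIOD_NS = 100                  # 10-MHz pixel rate
SIPO_WIDTH = 4                         # 4-bit serial-in parallel-out register
MEMORY_CYCLE_NS = PIXEL_PERIOD_NS * SIPO_WIDTH

def sipo(stream, width=SIPO_WIDTH):
    """Group a serial stream into words of `width` samples (one memory write each)."""
    if len(stream) % width:
        raise ValueError("stream length must be a multiple of the register width")
    return [tuple(stream[i:i + width]) for i in range(0, len(stream), width)]

def piso(words):
    """Inverse operation: flatten the stored words back into a serial stream."""
    return [sample for word in words for sample in word]

line = list(range(8))
stored = sipo(line)
restored = piso(stored)
print(MEMORY_CYCLE_NS, stored)         # 400 [(0, 1, 2, 3), (4, 5, 6, 7)]
```

Each memory write then carries four samples, which is why a 400-ns memory cycle keeps pace with a 100-ns pixel period.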

Because three adjacent lines of video are required simultaneously, three multiplexers and a register
are used to select the correct line of data for a given output line (ALN, BLN, CLN). The multiplexers are
controlled by a 4-bit output select word generated on the timing and synchronization board. During
parallel operations, the BLN output is unstored "live" video; during serial operations, the CLN output
represents unstored video. The memory boards also require row address strobe (RAS), column address
strobe (CAS), and write (WR) control lines to address and write data properly into the DRAMs.

Figure. PIPE Brassboard Buffer Memory Boards

Figure 37 shows eight lines of standard raster scan video. Because of the time interlacing, field A
data must be written to memory before field B data. The memory is partitioned to allow storage of four
TV lines at the same y address, i.e., TV lines 0, 1, 2, and 3 at y address 0. Control lines Y0 and ODDFLD
determine the specific DRAM to which data is written. For example, TV line 0 is stored at y address 0 in
the odd-field odd-line memories, which are the top row of memories in the memory diagram; also, TV
line 4 is stored in these memories at y address 1.

Reading data from the buffer memory is more complex. During a read operation, the y address must
be modified to obtain three adjacent lines of TV video. Referring again to Figure 37, when data from TV
line 2 is at output BLN (unstored video for parallel operations), data from TV lines 1 and 3 are at outputs
ALN and CLN, respectively. In this case no y address modification is necessary because all the data is at
y address 0. If data from TV line 3 is at output BLN, the y address is 0 and no modification is necessary
to obtain data from TV line 2 at output ALN. To obtain data from TV line 4 at output CLN, the y
address of the OFOL memory bank must be incremented by 1. When TV line 4 is at output BLN, a more
difficult situation arises: the y address of the EFEL memory bank remains 0 and the y addresses of the
OFOL and EFOL memory banks are incremented by 1 to obtain the proper y address.

Figure 37. TV Display

High-speed adders implement modifications to the y address. Constants C1, C2, C3, and C4 are
generated on the timing board and added to a base y address to create the modified y address for the
memory banks.
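
The y-address modification can be sketched as follows. The mapping of TV lines to banks (OFOL, EFOL, OFEL, EFEL for lines 0 through 3 of each group of four) is inferred from the interlace description and is an assumption, as is the choice of the ALN line's y address as the base; the report does not say which of C1 through C4 belongs to which bank.

```python
# Assumed bank order for TV lines 0, 1, 2, 3 within each group of four lines.
BANKS = ("OFOL", "EFOL", "OFEL", "EFEL")

def locate(tv_line):
    """Return (bank, y address) at which a TV line is stored."""
    return BANKS[tv_line % 4], tv_line // 4

def read_addresses(center_line):
    """Bank, y address, and offset from the base y address for ALN, BLN, CLN."""
    if center_line < 1:
        raise ValueError("center line needs a line above it")
    lines = {"ALN": center_line - 1, "BLN": center_line, "CLN": center_line + 1}
    base_y = (center_line - 1) // 4          # y address of the ALN line (assumed base)
    result = {}
    for output, line in lines.items():
        bank, y = locate(line)
        result[output] = (bank, y, y - base_y)   # offset is the added constant, 0 or 1
    return result

for center in (2, 3, 4):
    print(center, read_addresses(center))
```

For center lines 2, 3, and 4 this reproduces the three cases in the text: no modification, OFOL incremented by 1, and both OFOL and EFOL incremented by 1 while EFEL stays at 0.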

4. PIPE LSICs and Interface

The PIPE LSIC is capable of operating on 9 × 1 or 3 × 3 blocks of data, either of which may be
sliding or nonsliding. The PIPE LSIC calculates the sum of products for the nine 8-bit data words that
have been loaded into the LSIC input latches by the LSIC input strobe control line. LSIC control line
LOAD determines the data to be processed, and the PIPE brassboard must have circuitry to control the
generation of the LOAD pulse for the various types of input data arrangements. To increase throughput,
eight PIPE LSICs are used in the brassboard; all inputs of the LSICs are connected, and data is strobed
into all devices simultaneously. An input controller generates the appropriate LOAD pulses to allow the
eight PIPE LSICs to operate in parallel, and an output controller selects the outputs of the devices for
postprocessing functions.
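
A ROM-accumulate sum of products for nine 8-bit words can be sketched in software. This is a minimal model of the general technique (a lookup table of partial weight sums addressed by one bit from each input, followed by shift and accumulate), assuming unsigned data; the table layout, bit ordering, and low-pass weights are illustrative choices, not the PIPE's actual memory contents.

```python
def build_rom(weights):
    """Table indexed by a 9-bit address: the sum of the weights whose address bit is 1."""
    n = len(weights)
    rom = []
    for addr in range(1 << n):
        total = 0
        for i, w in enumerate(weights):
            if (addr >> (n - 1 - i)) & 1:    # input i occupies address bit n-1-i
                total += w
        rom.append(total)
    return rom

def rom_accumulate(rom, samples, word_length=8):
    """Process the samples bit-serially, MSB first: acc = 2 * acc + rom[address]."""
    acc = 0
    for bit in range(word_length - 1, -1, -1):
        addr = 0
        for x in samples:
            addr = (addr << 1) | ((x >> bit) & 1)
        acc = (acc << 1) + rom[addr]
    return acc

weights = [1, 2, 1, 2, 4, 2, 1, 2, 1]          # unnormalized 3 x 3 low-pass window
samples = [10, 20, 30, 40, 50, 60, 70, 80, 90]
rom = build_rom(weights)
result = rom_accumulate(rom, samples)
direct = sum(w * x for w, x in zip(weights, samples))
print(len(rom), result, direct)                # 512 800 800
```

The table has 512 entries, matching the 9-bit address, and eight lookups replace nine multiplications.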

The eight PIPE LSICs, input controller, and output controller are partitioned into two wirewrap
boards (Figure 38). Each board contains four PIPE LSICs (Figure 39) and either the input or output
controller. All the data inputs are connected, as are many of the other control signals: word length, 2's
complement data ( LCD), clock, parallel/serial select, etc. Each PIPE LSIC has a separate LOAD,
ENABLE, and DATA VALID control line. The 20-bit outputs of the PIPE LSICs are also connected.
Figure 41. PIPE Demonstration Brassboard Processing at 30 Frames/Second (8-Bit Data;
3 × 3 Sliding Window; Same Weights) Image Size Versus EPROM Access Time


the same weighting coefficients in each PIPE, the throughput of the brassboard is eight times the
throughput of a single PIPE LSIC because each PIPE LSIC is loaded sequentially, allowing each device to
calculate an answer for adjacent neighborhoods of the image. For operations with different weights in
each PIPE LSIC (template-matching edge detectors), eight PIPE LSICs are required for maximum
throughput, with one template assigned to each PIPE. For differential edge detectors (Sobel, Prewitt, etc.),
the weighting coefficients are arranged in pairs: the first four PIPE LSICs calculate the horizontal
response; the last four calculate the vertical response.
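
As a hedged illustration of the paired-weight arrangement, the sketch below applies the standard Sobel kernels to one 3 × 3 window and combines the two responses. Which kernel the report calls "horizontal" and which "vertical", and the use of a Euclidean magnitude, are assumptions; the function names are illustrative only.

```python
import math

# Standard Sobel kernels, flattened row by row (3 x 3 window -> 9 weights).
SOBEL_X = [-1, 0, 1, -2, 0, 2, -1, 0, 1]    # responds to left-to-right intensity change
SOBEL_Y = [-1, -2, -1, 0, 0, 0, 1, 2, 1]    # responds to top-to-bottom intensity change

def sum_of_products(weights, window):
    """The nine-term weighted sum that one device computes for one window."""
    return sum(w * x for w, x in zip(weights, window))

def edge_response(window):
    """Return both responses plus the magnitude and orientation postprocessing results."""
    gx = sum_of_products(SOBEL_X, window)
    gy = sum_of_products(SOBEL_Y, window)
    return gx, gy, math.hypot(gx, gy), math.degrees(math.atan2(gy, gx))

# A window straddling a vertical edge: dark on the left, bright on the right.
window = [10, 10, 200,
          10, 10, 200,
          10, 10, 200]
gx, gy, magnitude, angle = edge_response(window)
print(gx, gy, magnitude, angle)             # 760 0 760.0 0.0
```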

Because of the PIPE LSIC EPROM access time and the particular requirement of certain operations,
the PIPE brassboard does not process 512 × 512 images at 30 frames/second. Figure 41 indicates the
effect of EPROM access time on the maximum number of pixels per line, assuming 30 frames/second.
Assuming a 100-ns EPROM access time, the PIPE brassboard could process a 512 × 512 image for 3 ×
3 sliding window same weights operations at 30 frames/second. The EPROM access time is
approximately 170 ns, however, as discussed in the PIPE LSIC section. Only an image size of 512 × 313
can be processed at 30 frames/second.

An alternative to reducing image size is to reduce the frame rate. Figure 42 shows the frame rate as a
function of PIPE LSIC EPROM access time for the 3 × 3 sliding window same weights and 3 × 3 sliding
window paired weights (Sobel).

Figure 42. Frame Update Rate Versus EPROM Access Time for a 512 X 512 Image

To maintain the capability of the brassboard to evaluate future processors, the PIPE brassboard
operates on 512 X 512 images at the frame rate dictated by the EPROM access time and the type of
operation. Table 8 shows the number of passes each operation must take through a 512 X 512 image and
the resulting frame rate. The input controller records the window location on a line and the number of
passes made, to address the timing PROM that generates the load clock. The load enable generator and
the multiplexers (Figure 40) provide different modes of input data demultiplexing to accommodate
transform coefficient calculations and differential edge detectors. To calculate transform coefficients and
template-matching edge detectors, all eight PIPE LSICs must be loaded with the same data; i.e., have the
same load pulse. This is accomplished by enabling the second multiplexer of the input controller, thus
jointly activating LOADS 0-7. For differential edge operators, the same data is needed by the first and
fifth PIPE LSICs, the second and sixth, etc. This pairing of LOAD pulses is accomplished by the OR gates
and first multiplexer of the input controller.
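
The load modes described above can be summarized in a short Python sketch. The mode names and the function `load_groups` are hypothetical labels introduced here for illustration only; the brassboard realizes this grouping with multiplexers and OR gates, not software.

```python
def load_groups(mode):
    """Return the groups of PIPE LSIC indices (0-7) that share one LOAD pulse."""
    if mode == "sequential":
        # Each LSIC receives its own LOAD pulse in turn.
        return [(i,) for i in range(8)]
    if mode == "common":
        # Transform coefficients and template matching: LOADS 0-7 together.
        return [tuple(range(8))]
    if mode == "paired":
        # Differential edge operators: first and fifth, second and sixth, etc.
        return [(i, i + 4) for i in range(4)]
    raise ValueError(f"unknown load mode: {mode}")
```

For example, `load_groups("paired")` pairs LSIC 0 with LSIC 4, LSIC 1 with LSIC 5, and so on, matching the horizontal/vertical split used for the Sobel operator.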


TABLE 8. FRAME RATES FOR VARIOUS OPERATIONS
ON 512 X 512 IMAGE

                                                      No. of    Frame Rate
Operation                                             Passes    (fps)

1 nonsliding window; same or different weights          2        15
1 sliding window; same weights                          2        15
3 X 3 nonsliding window; same weights                   1        30
3 X 3 nonsliding window; different weights              6         5
3 X 3 sliding window; same weights                      2        15
3 X 3 sliding window; different weights                16         1.875
3 X 3 sliding window; paired weights                    4         7.5

Figure 43. LSIC Input Controller Timing, 3 X 3 Sliding Window Operation
(Same Weights in All PIPE LSICs)
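
Every row of Table 8 is consistent with the frame rate being the 30-frames/second real-time rate divided by the number of passes through the image. The following Python sketch, with hypothetical names, reproduces the 3 X 3 rows under that assumption.

```python
FULL_RATE_FPS = 30.0  # real-time frame rate for a single pass over the image

# Passes per operation, transcribed from Table 8 (3 X 3 window rows).
PASSES = {
    "nonsliding, same weights": 1,
    "nonsliding, different weights": 6,
    "sliding, same weights": 2,
    "sliding, different weights": 16,
    "sliding, paired weights": 4,
}

def frame_rate(passes):
    """Frame rate in frames/second when each frame needs `passes` passes."""
    if passes < 1:
        raise ValueError("at least one pass is required")
    return FULL_RATE_FPS / passes

for name, passes in PASSES.items():
    print(f"3 X 3 {name}: {frame_rate(passes):g} fps")
```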


Figures 43, 44, and 45 illustrate the LOAD pulse (LP) timing for the 3 X 3 sliding window with
same, different, and paired weights, respectively. A two-phase PIPE LSIC clock, PCLKE and PCLKO,
prevents certain pixels from being ignored as a result of the difference in the strobe frequency (10 MHz)
and the PIPE LSIC clock rate (5 MHz).

b.  Output  Controller 

The outputs of the eight PIPE LSICs are controlled by an output controller (Figure 46), which selects
any one, any pair, or all of the outputs for postprocessing. The output controller generates the tristate
enable timing for the PIPE LSICs based on the type of LSIC operation.

Figure 44. LSIC Input Controller Timing, 3 X 3 Sliding Window Operation
(Different Weights in All PIPE LSICs)

STAGGERED PCLKs ALLOW EFFECTIVE OUTPUT DATA RATE OF
10 MHZ WHILE LSICs ARE CLOCKED AT 5 MHZ

TWO PASSES THROUGH IMAGE REQUIRED

Figure 45. LSIC Input Controller Timing, 3 X 3 Sliding Window Operation
(Paired Weights Sobel)

Figure 46. LSIC Output Controller PIPE Demonstration Brassboard

The DATA VALID generated in the PIPE LSIC is used to multiplex the outputs of the eight PIPE
LSICs in much the same way the LOAD pulse was used to demultiplex the inputs. For operations in
which the LOAD pulses are generated sequentially, the DATA VALID pulses occur sequentially and the
output control multiplexer selects the DATA VALID pulses as ENABLE signals for the PIPE LSICs. For
operations in which the LOAD pulses are generated in parallel, the DATA VALID pulses also occur in
parallel and cannot be used to enable the PIPE LSICs because the outputs of the four PIPE LSICs on each
LSIC module are connected. Therefore, the PIPE LSICs must be enabled sequentially. This is done by
loading one of the DATA VALID pulses into an eight-stage serial-in parallel-out shift register; as the
DATA VALID pulse is shifted down the shift register, the outputs are used to enable the PIPE LSICs
sequentially.
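
The shift-register enabling scheme can be illustrated with a small Python simulation. This is a behavioral sketch only; the function name and the list representation of the register are assumptions made for illustration.

```python
def enable_sequence(stages=8):
    """Shift one DATA VALID pulse through a serial-in parallel-out register.

    Returns the register outputs after each clock. Exactly one output is
    active per clock, so the PIPE LSICs are enabled one at a time.
    """
    register = [0] * stages
    history = []
    for clock in range(stages):
        serial_in = 1 if clock == 0 else 0      # the pulse enters on the first clock only
        register = [serial_in] + register[:-1]  # shift one stage down
        history.append(register)
    return history

for clock, outputs in enumerate(enable_sequence()):
    print(clock, outputs)
```

Because only one stage holds the pulse at any time, the tied-together tristate outputs on each LSIC module are never driven by two devices at once.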

The outputs of each set of four PIPE LSICs are loaded into 8-bit latches. (Only 8 bits of the LSIC's
20-bit output are used because the postprocessing and refresh memory functions are designed for 8-bit
input values.) Rather than using the 8 LSBs of the 20-bit output of the PIPE LSICs, outputs D2 through
D9 are utilized. This represents a divide-by-4 of the output but allows more efficient use of the dynamic
range by preventing overflow (which causes saturation of the postprocessing electronics) for certain
image-processing operators, notably the Sobel edge operator and fifth-level template-match edge detector.
Other operators' weights can be adjusted slightly to take advantage of this increased dynamic range.
Outputs D10 through D18 are used to indicate overflow or underflow, and D19 is the sign bit.
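
The bit selection can be sketched in Python under stated assumptions: the 20-bit two's-complement output is numbered D0 (LSB) to D19 (sign), the divide-by-4 selection takes D2 through D9, and an out-of-range result is one whose bits D10 through D18 are not all copies of the sign bit. The bit numbering is inferred from the divide-by-4 description, and the function name is hypothetical.

```python
def select_output_bits(value):
    """Split a 20-bit two's-complement word into (byte, sign, out_of_range)."""
    word = value & 0xFFFFF         # keep the low 20 bits
    byte = (word >> 2) & 0xFF      # D2..D9: the divide-by-4 output
    sign = (word >> 19) & 1        # D19: sign bit
    upper = (word >> 10) & 0x1FF   # D10..D18: overflow/underflow indicators
    # In range only if the upper bits simply repeat the sign bit.
    out_of_range = upper != (0x1FF if sign else 0)
    return byte, sign, out_of_range
```

With this selection, a sum of products of 4 yields an output byte of 1, which is why a center weight of 4 produces unity gain.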

For the 3 X 3 sliding window paired weights operation, two 8-bit data words (ODHI and ODLO,
one from each set of the four PIPE LSICs) are presented to the postprocessing functions. For all other
operations, the output controller's data selector presents a single 8-bit word (ODALL) to the
postprocessing circuitry.

Figures 47, 48, and 49 show output controller timing for the 3 X 3 sliding window and same,
different, and paired weights, respectively. The DATA VALID (DV) pulses and the corresponding PIPE
LSIC ENABLE (EN) pulses are shown. Output data from the PIPE LSICs are loaded into latches on the
falling edge of the ENABLE pulse.

5.  Postprocessing 

To condition the outputs of the PIPE LSICs for storage in the refresh memory and for subsequent
display, a limited amount of postprocessing electronics is included in the demonstration brassboard. To
calculate the magnitude of the horizontal and vertical response of the differential edge operators, a simple
magnitude function, a sum of absolute values, is provided. Also included are a coarse (3-bit) direction
calculation and the ability to find the maximum of eight inputs for determining edge
orientation using template-matching edge detectors.

Figure 47. LSIC Output Controller Timing, 3 X 3 Sliding Window Operation
(Same Weights in All PIPE LSICs)

There is also the option to bypass the postprocessing functions because postprocessing is not
required in some applications. Additionally, an operator-controlled threshold is provided to permit
evaluation of various thresholds on different image-processing algorithms. The operator selects a threshold
(0-255) and decides if values above or below the threshold value are to be passed directly to the refresh
memory: if values above the threshold are to be passed unmodified to the refresh memory, all data values
below the threshold will be set to 0; if values below the threshold are to be passed to the refresh memory,
all values above the threshold will be set to 255. The postprocessing also has circuitry to pass 255 for
negative overflow conditions when the operator selects the magnitude function; when other
postprocessing functions are selected, a 0 is passed for negative input values. This prevents the display of
negative pixel values as positive values on the monitor.
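
The two threshold modes can be written as a small Python function. This is a sketch with hypothetical names; the text does not say how a value exactly equal to the threshold is treated, so passing it unmodified is an assumption.

```python
def apply_threshold(value, threshold, pass_above):
    """Apply the operator-controlled threshold to an 8-bit value (0-255)."""
    if pass_above:
        # Values above the threshold pass unmodified; values below become 0.
        return value if value >= threshold else 0
    # Values below the threshold pass unmodified; values above become 255.
    return value if value <= threshold else 255
```

For example, with a threshold of 128, `apply_threshold(100, 128, True)` returns 0 while `apply_threshold(200, 128, False)` returns 255.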

The postprocessing board, which also contains part of the refresh memory controller circuitry, is
shown in Figure 50.

Figure 48. LSIC Output Controller Timing, 3 X 3 Sliding Window Operation
(Different Weights in Each PIPE LSIC)

Figure 49. LSIC Output Controller Timing, 3 X 3 Sliding Window Operation
(Paired Weights Sobel)

Figure 50. Postprocessing Board


In Figure 51, a simplified block diagram of the postprocessing electronics, input signals ODHI,
ODLO, and ODALL represent the data from the PIPE LSIC output controller. The output select signals
are decoded from the brassboard's front panel. For magnitude operations, data inputs ODHI and ODLO,
which represent the outputs of each set of four PIPE LSICs, are used; for finding the maximum of eight
inputs or for no operation, the data inputs ODALL, which represent the outputs of all eight PIPE LSICs,
are used.

Figure 51. PIPE Demonstration Brassboard Postprocessing

The sum-of-absolute-values function is implemented for circuit simplicity. The absolute value of
negative input values (D19 high) is formed by adding 1 to the 1's complement of the negative values. If D19
is low, indicating a positive input value, the value is unchanged. If a positive overflow exists, the value
is passed; if a negative overflow exists, the value is passed to the absolute-value circuitry.

A coarse edge direction is calculated with the sign bits of the horizontal and vertical responses
and knowledge of the magnitudes. The inputs are defined as the sign bits of the horizontal and vertical
responses, respectively, and an indication that the horizontal response is greater than the vertical
response. Figure 52 shows the relationship of the sign bits and magnitudes to the edge direction. The
implementation of the edge direction is straightforward and is also shown in Figure 52, because the
absolute-value functions already exist. The 3 output bits represent the edge direction
quantized to 45 degrees.
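
A minimal Python sketch of the magnitude and coarse direction calculations follows. It assumes 8-bit two's-complement inputs and a magnitude clamped at 255. The ordering of the three direction bits is illustrative only, since the brassboard's actual encoding is defined in Figure 52 and is not reproduced here.

```python
def abs8(value):
    """Absolute value of an 8-bit two's-complement number (1's complement plus 1)."""
    byte = value & 0xFF
    if byte & 0x80:                  # sign bit set: negative input
        return ((~byte) + 1) & 0xFF
    return byte

def magnitude(h, v):
    """Sum of absolute values, clamped to the 8-bit display range (assumed)."""
    return min(abs8(h) + abs8(v), 255)

def direction_bits(h, v):
    """Three bits: sign of H, sign of V, and whether |H| exceeds |V|."""
    return (int(h < 0), int(v < 0), int(abs8(h) > abs8(v)))
```

Three bits distinguish eight sectors, which is consistent with a direction quantized to 45 degrees.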

The maximum-of-eight input is also straightforward. The output of a PIPE LSIC is loaded into an
8-bit latch and compared to the next available LSIC output: if the most recent PIPE LSIC output is larger
than the value stored in the latch, it is loaded into the latch; if it is not larger, the value in the latch is
unchanged. After all eight PIPE LSIC outputs have been tested, the latch contains the maximum value. If
a positive overflow condition exists, the value 255 is displayed; if a negative overflow exists, a 0 is
displayed. For no-operation postprocessing, only positive values between 0 and 255 are displayed: negative
values are replaced by 0, and values greater than 255 are replaced by 255.
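
The latch-and-compare procedure can be sketched in Python as follows. The function name is hypothetical, and modeling the overflow handling as a final clamp to the range 0-255 is a simplifying assumption.

```python
def max_of_eight(outputs):
    """Find the maximum of the LSIC outputs with a latch-and-compare loop."""
    latch = outputs[0]               # the first output is loaded into the latch
    for value in outputs[1:]:
        if value > latch:            # larger: reload the latch
            latch = value
        # not larger: the latch is unchanged
    return max(0, min(latch, 255))   # negative shows 0, overflow shows 255
```

In template matching, each of the eight inputs is the response for one compass direction, so the result is the strongest edge response at that pixel.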

The operator has complete control over the postprocessing function, selecting the function
(magnitude, maximum-of-eight, or no-operation processing) to pass to the refresh memory.

6.  Refresh  Memory 

The refresh memory has two functions: it accepts data from the postprocessing electronics and
provides digital words to the DAC for reproducing analog video. These functions cannot be performed at
the same speed and thus must be synchronized. Implementation of the refresh memory is almost
identical to that of the buffer memory, differing only slightly in the write and output sections to account
for the output controller and postprocessing timing. The refresh memory stores a complete video frame
(512 X 512 X 8) and requires four wire-wrap boards as shown in Figure 53. Each board holds two bit
planes, i.e., 512 X 512 X 2 bits.

Figure 52. Edge Direction Definition (Above) and Implementation (Below)

Figure 53. Refresh Memory Boards

Figure 54 is a block diagram of the refresh memory circuitry. As in the buffer memory, the refresh
memory is partitioned into odd and even fields and odd and even lines. Also, the y address is modified
using the same y address lines and address modifiers C1 through C4 as in the buffer memory. In the
refresh memory, however, the x address is also modified; this is necessary to compensate for the delays in
passing the data from the input to the output of the brassboard. The x address is modified by using
control lines B0 through B7, which are inputs to the fast adders used for y address modification and
which are also constant because the delays through the system are constant for a given operation. Another
distinction versus the buffer memory is that the refresh memory does not provide the simultaneous
outputs. This reduces the number of output multiplexers to one and the number of output select lines to
three. The write circuitry for the refresh memory boards is also slightly different. On certain operations
done by the PIPE LSICs, a single output word is produced every 16 clock cycles; therefore, two sequential
PIPE LSIC outputs are not adjacent pixels on the display. This requires that each memory cell (i.e., a
memory device at row x, column y in Figure 54) be written independently and exclusively of all other
memory cells. Additional write enable decoding hardware provides 16 independent write pulses
(WB11-WB44). Control lines Y0, ODDFLD, and WR provide a write pulse for the desired row (i.e.,
OFOE, OFEE, etc.) while control lines ENWB1, ENWB2, ENWB3, and ENWB4 determine which
column or columns of the row receive a write pulse. Therefore, writing to any given memory cell may be
done independently.

Reading data from the refresh memory is identical to the buffer memory read operation.

Figure 56. Synchronization and Timing Board

7.  Synchronization  and  Timing 

The PIPE demonstration brassboard contains all the synchronization and timing needed to control
each of its functions. As shown in the block diagram, Figure 55, the timing board (pictured in Figure 56)
generates the necessary synchronization for the COHU 4400 camera, as well as the horizontal and
vertical blanking and sync and composite blanking and sync for digitizing and later creating
RS-170-compatible video. The x and y addresses, row address strobe, column address strobe, write, load,
and address modifiers for the buffer and refresh memories are generated on the timing board.

Camera synchronization is implemented by a commercially available TV camera sync generator IC
requiring a 1.260-MHz reference. A minimum sampling time of 102.5 ns is needed to acquire 512
samples along a horizontal line of EIA Standard RS-170 video. This corresponds to a 9.75-MHz clock
rate, which is not an integer multiple of the camera sync generator reference, and a slightly higher
10.08-MHz sample clock was used. The effect of this faster clock is indicated in Figure 57. For the PIPE
brassboard, the active horizontal line time is 50.8 µs, a decrease of 1.7 µs from the EIA Standard RS-170
video. This minor deviation does not affect the performance of the brassboard and greatly simplifies
synchronization. A 40.32-MHz crystal oscillator is used as the master clock for the brassboard to permit
precise control of the various functions.
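
The sample-clock figures in this paragraph can be checked with a few lines of Python. The variable names are hypothetical. The factor of 8 simply reflects that 10.08 MHz is eight times the 1.260-MHz sync generator reference.

```python
SYNC_REF_MHZ = 1.260         # camera sync generator reference
SAMPLES_PER_LINE = 512
MIN_SAMPLE_TIME_NS = 102.5   # sampling time for 512 samples per standard line

ideal_clock_mhz = 1000.0 / MIN_SAMPLE_TIME_NS                   # about 9.76 MHz
sample_clock_mhz = 8 * SYNC_REF_MHZ                             # 10.08 MHz
standard_line_us = SAMPLES_PER_LINE * MIN_SAMPLE_TIME_NS / 1000.0
brassboard_line_us = SAMPLES_PER_LINE / sample_clock_mhz
shortfall_us = standard_line_us - brassboard_line_us

print(f"ideal clock      {ideal_clock_mhz:.3f} MHz")
print(f"sample clock     {sample_clock_mhz:.2f} MHz")
print(f"active line time {brassboard_line_us:.1f} us ({shortfall_us:.1f} us shorter)")
```

Choosing an integer multiple of the reference shortens the active line slightly but lets the sample clock and the sync generator run from one source.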


Figure 57. EIA Standard RS-170 and PIPE Brassboard Horizontal Timing Relationships


Horizontal, vertical, and composite blanking and synchronization are generated from the x and y
addresses. The 9-bit x address is implemented with synchronous counters clocked by the gated
10.08-MHz clock. The x address counts from 0 to 511 during the active horizontal line time; during the
512 to 640 count, the horizontal blanking pulse is active, and at 640 the x address generator is cleared to 0.
Correspondingly, during the 531 to 579 count, the horizontal synchronization pulse is active. The x
address generator is preset to count 579 by the horizontal reset, which is generated from the standard
composite sync generator IC. Figure 57 shows the x address count for the various horizontal timing
relationships.
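
The horizontal timing windows can be expressed as a simple decode of the x address count, sketched below in Python. The function name is hypothetical, and treating both ends of the 531 to 579 sync window as inclusive is an assumption.

```python
COUNTS_PER_LINE = 640    # the x address generator is cleared at count 640
SAMPLE_CLOCK_MHZ = 10.08

def horizontal_signals(x_count):
    """Decode one x address count (0-639) into (active, blanking, sync)."""
    if not 0 <= x_count < COUNTS_PER_LINE:
        raise ValueError("x count must be in 0..639")
    active = x_count <= 511          # active horizontal line time
    blanking = x_count >= 512        # horizontal blanking pulse
    sync = 531 <= x_count <= 579     # horizontal synchronization pulse
    return active, blanking, sync

line_period_us = COUNTS_PER_LINE / SAMPLE_CLOCK_MHZ
```

At 10.08 MHz, 640 counts last about 63.5 µs, which agrees with the nominal RS-170 horizontal line period.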

The y address generator is similar to the x address generator. The y address generator produces 8-bit
addresses. The standard horizontal sync pulse generates control signal CLAMP during the horizontal
back-porch interval, clocking the y address generator. The vertical sync pulse resets the y address
generator to 0. Vertical blanking occurs when the ripple carry outputs of the y address counters indicate an
overflow condition. Vertical sync is implemented using up/down counters clocked by the 1.260-MHz
clock; the standard composite sync IC controls the direction of count. Composite sync and composite
blanking are generated by logical ORing of the horizontal and vertical sync and blanking, respectively.

The ninth bit of the y address is an even/odd field indicator. It is generated by counting the number
of  transitions  of  the  fifth  bit  of  the  x  address  (x4)  during  the  period  beginning  with  vertical  blanking  and 
ending  with  vertical  sync;  this  is  equivalent  to  counting  equalizing  pulses  of  broadcast-format  video.  The 
number  of  transitions  of  x4  during  a  count  is  compared  to  the  number  of  transitions  during  the  previous 
count:  if  there  are  more  transitions  in  the  most  recent  count,  it  is  the  even  field;  otherwise,  an  odd  field  is 
indicated. 

The memory controller uses the 9-bit x address, 8-bit y address, even/odd field indicator, composite
blanking, gated 10.08-MHz clock, and parallel/serial control line to create the row and column address
strobe, write pulses, address modifiers, and multiplexed address for the buffer and refresh memories.

The memory controller generates only a 7-bit multiplexed x and y address for the buffer and refresh
memories. As four values are written into memory in parallel, the x address is reduced by 2 bits.
Additional y address bits (Y0 and ODDFLD) are used by the memory boards to achieve full 9-bit y
addressing. A write enable (WE) control line from a pushbutton on the brassboard's front panel implements a
frame-freeze function by inhibiting the write pulses to the buffer and refresh memories.
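
The address reduction can be illustrated with a short Python sketch. It assumes that the 2 bits removed from the x address are its least significant bits and that Y0 is the least significant bit of the 8-bit y address. The function name and the returned tuple layout are hypothetical.

```python
def split_address(x, y, odd_field):
    """Split a pixel address into 7-bit multiplexed addresses and select bits.

    x: 0-511 (9 bits); y: 0-255 (8-bit line address within a field);
    odd_field: 0 or 1 (the ninth y bit, ODDFLD).
    """
    column_addr = x >> 2       # 7 bits: four values are written in parallel
    value_in_group = x & 0x3   # which of the four parallel values
    row_addr = y >> 1          # 7 bits remain after Y0 is removed
    y0 = y & 0x1               # used on the memory boards together with ODDFLD
    return column_addr, row_addr, value_in_group, y0, odd_field
```

Two 7-bit addresses select one of 16,384 locations, which matches the 16K X 1 dynamic RAMs used for the memories.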

8.  Control  Panel 

All operations of the PIPE brassboard are controlled by the user at a panel (shown in Figure 58). The
front panel contains a 1-line, 24-character liquid crystal display (LCD); a 16-character (0-9, A-D, *, #)
key pad; and a thumbwheel switch. An 8-bit microprocessor and peripheral interface device on the
brassboard controller permits the user to set up the demonstration brassboard to implement a selected
algorithm. The CPU queries the user about the type of operation (parallel or serial, sliding or nonsliding),
the type of input data (2's complement or magnitude), the type of memory coefficients (2's complement
or magnitude), the length of the input word, and the type of postprocessing desired. The controller also
asks how the weighting coefficients of the eight PIPE LSICs are arranged; i.e., same weights in all the
LSICs, different weights in each LSIC, or paired weights. The controller notifies the operator if a nonvalid
response has been entered; for example, a letter input when a number is required. If the operator requests
an undefined operation, the controller notifies the user. On request (# key), the controller displays the
current operating parameters. The * key is used to begin the process of setting up new operating
parameters. The software required for the CPU is stored in a 2K X 8 EPROM and can be easily changed to
upgrade or modify the operations of the brassboard.

Figure 58. Front Panel Brassboard Controller

The thumbwheel switch on the front panel is used to threshold the output and has two modes of
operation to enhance the output image; either the output values above the threshold are set to gray-level
255 or the values below the threshold are set to gray-level 0.

On the key pad, symbols A, B, C, and D automatically instruct the brassboard to perform certain
predefined operations. The A key represents Sobel operation: sliding, parallel, 8-bit magnitude input
data; 2's complement weighting coefficients arranged in pairs; and magnitude postprocessing. When
power is initially applied to the brassboard, the controller defaults to Sobel operation. The B key
represents operations requiring sliding, parallel, 8-bit magnitude input data; 2's complement weighting
coefficients; different weights for each PIPE LSIC; and maximum-of-eight output postprocessing. This
operation is used for template-matching algorithms (i.e., compass gradient, Kirsch, etc.). Operations
selected by the C key are similar to B-key operations except that the same weights are in each PIPE LSIC
and a "no-op" postprocessing function is selected; this allows a sliding 3 X 3 filter to be implemented at
the maximum frame rate. The only difference between C-key and D-key operations is that the latter
selects nonsliding operations, allowing processing on 3 X 3 contiguous blocks of pixels.

The brassboard controller is implemented on a board located in the left rear quarter of the
brassboard and connected to the front panel and other brassboard functions through 40-pin ribbon cable.
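
The four predefined key-pad operations can be collected in a lookup table, sketched here in Python. The dictionary layout and field names are hypothetical; the values restate the A, B, C, and D key descriptions, and the controller's actual software is not reproduced here.

```python
# Settings common to all four predefined operations, per the key descriptions.
COMMON = {"data_path": "parallel", "input_data": "8-bit magnitude",
          "coefficients": "2's complement"}

PRESETS = {
    "A": {**COMMON, "sliding": True, "weights": "paired",
          "postprocessing": "magnitude"},            # Sobel operation
    "B": {**COMMON, "sliding": True, "weights": "different",
          "postprocessing": "maximum-of-eight"},     # template matching
    "C": {**COMMON, "sliding": True, "weights": "same",
          "postprocessing": "no-op"},                # sliding filter
    "D": {**COMMON, "sliding": False, "weights": "same",
          "postprocessing": "no-op"},                # nonsliding blocks
}

POWER_ON_DEFAULT = "A"  # the controller defaults to Sobel operation
```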

9.  Summary 

The  main  design  goals  of  the  PIPE  demonstration  brassboard  were  the  ability  to  fully  evaluate  the 
PIPE  LSIC  and  the  flexibility  to  evaluate  future  image-processing  hardware  developments.  Those  goals 
have  been  achieved  by  functionally  partitioning  the  brassboard  into  four  major  sections: 

•  Analog-to-Digital  Conversion 

This  half-board  digitizes  standard  RS-170  video  from  a  COHU  4400  camera  into  8-bit 
digital  data. 

•  Memory 

Eight boards of the brassboard are used to implement a frame buffer and refresh memory.
Both memories store a complete video frame (512 X 512 X 8 bits) and are designed
with 16K X 1 dynamic RAMs. Each board holds two bit planes (i.e., 512 X 512 X 2
bits).

•  PIPE  LSIC  and  Postprocessing 

These three boards contain eight PIPE LSICs, their input and output control circuitry,
and the postprocessing functions. These boards can be replaced by future processor
boards and evaluated using real-time video.

•  Digital-to-Analog  Conversion 

This  half-board  converts  the  digital  data  from  the  refresh  memory  into  analog  data  and 
provides  the  necessary  synchronization  to  create  RS-170-compatible  composite  video 
for  the  TV  monitor. 

The partitioning of the system is shown in Figure 59. A synchronization and timing board to
generate the necessary clocks to control each of the brassboard functions makes the brassboard
completely self-contained. To communicate the type of desired operations, an interface between the user
and the PIPE LSICs is implemented through a front-panel controller.

Figure 59. PIPE Demonstration Brassboard

TABLE 9. POWER AND SIZE OF PIPE
DEMONSTRATION BRASSBOARD

                              Size      Power    No. of
Function                      (in.²)    (W)      Boards

ADC/DAC                         36      14.1       1
Buffer memory                  144      18.6       4
PIPE LSICs/IC interface         72      15.3       2
Postprocessing                  36       9.4       1
Refresh memory                 144      20.3       4
Timing and control              36       6.9       1
Front-panel controller*         28       2.0       1

Total                          496      86.6      14

*The front-panel controller board is external to the
rack assembly.

The  brassboard  hardware  is  shown  in  Figure  60.  The  lower  photograph  shows  the  brassboard  with 
the  front  panel  lowered,  exposing  the  13  wire-wrap  boards  on  which  the  brassboard's  functions  are 
constructed. 

The power and size of the demonstration brassboard are indicated in Table 9. The memory function
has the largest power and size requirement, but its inclusion gives the desired flexibility.

The brassboard is designed to operate in real time, i.e., thirty 512 X 512 X 8-bit video frames/s.
The ADC, buffer memory, postprocessing, refresh memory, timing functions, and DAC are capable of
operating with a 100-ns clock. The access time of the EPROM on the PIPE LSICs prohibits operating
faster than approximately 200 ns, and multiple passes through an image are required to process an entire
frame. In Table 10 are the frame rates for the various operations. These reduced frame rates are for the
PIPE LSICs only and do not prohibit real-time evaluation of future processor developments.


TABLE 10. FRAME RATE FOR VARIOUS OPERATING CHARACTERISTICS

                                    Operation
                    Sliding                          Nonsliding
Input               Weights                          Weights
Data        Same    Different    Paired      Same    Different    Paired

Parallel    15      1.875        7.5         30      5            *
Serial      15      *            *           15      15           *

* Operations not defined


O.  BRASSBOARD  DEMONSTRATION 


A major objective of the brassboard is to demonstrate the versatility of the PIPE LSIC.
Demonstrating the PIPE LSIC capability to perform vector or transform operations requires a serial data
path into the PIPE LSIC and the necessary processing of the PIPE LSIC output for display on the TV
monitor. These operations are supported by the brassboard, but, because of the serial nature of the data,
the display does not provide the observer an interesting image. A more meaningful demonstration of the
brassboard is the parallel mode of operation. These operations show the flexibility of both the brassboard
and PIPE LSIC. Neighborhood operators were used during the hardware evaluation and to demonstrate the
brassboard.

The testing of the various functions of the brassboard utilized a 3 X 3 sliding, same-weights (in
all eight PIPE LSICs), no-postprocessing operation with a unity operator: all 0's in the 3 X 3 window
except for a value of 4 in the center pixel location. Recall that the PIPE LSIC outputs are shifted by 2 bits;
thus, a value of 4 rather than 1 is placed in the center pixel location. With the unity operator, the output of
the brassboard should be a replica of the input; any noise, ghosting, or other artifacts are readily observed.
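
The unity operator test can be reproduced in a few lines of Python. This is a software sketch of the sum-of-products window, not the brassboard hardware; the function names are hypothetical, and border pixels are skipped for simplicity.

```python
UNITY = [[0, 0, 0],
         [0, 4, 0],
         [0, 0, 0]]  # center weight of 4 compensates for the 2-bit output shift

def window_response(image, weights, row, col):
    """Sum of products over the 3 X 3 neighborhood centered on (row, col)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += weights[dr + 1][dc + 1] * image[row + dr][col + dc]
    return total

def apply_operator(image, weights):
    """Slide the 3 X 3 operator over interior pixels, then shift right by 2 bits."""
    rows, cols = len(image), len(image[0])
    return [[window_response(image, weights, r, c) >> 2
             for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]
```

Applying `apply_operator(image, UNITY)` to any image of nonnegative pixel values returns the interior pixels unchanged, so any difference seen on the monitor points to a hardware fault rather than to the operator.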

After all the brassboard functions were operational, the weighting arrays in the PIPE LSICs were
changed to demonstrate more interesting image-processing operations. A common and important
image-processing operator is low-pass filtering. Several weighting arrays are suitable for low-pass filtering, but
basically they all calculate the mean of the intensity levels in a small neighborhood of pixels. Figure 61
demonstrates the use of the PIPE LSICs to implement a low-pass filter. The weighting array was
programmed into all eight PIPE LSICs, allowing maximum throughput. The no-postprocessing function was
used. As previously discussed, the PIPE LSIC EPROM has an access time of approximately 200 ns,
requiring two passes through the image; this represents a 15-Hz frame rate. Considerable blurring of the
image, indicative of low-pass filtering, can be observed.

Another important image-processing operator is the edge detector. The two most common types are
differential edge detectors and template-match edge detectors. Differential edge detectors require spatial
convolutions of the input image with both horizontal and vertical weighting arrays. The outputs of the
two convolutions are combined to produce an edge magnitude. Figure 62 shows the result of
implementing the Sobel differential edge detector with a sum of absolute values as the magnitude technique. To
improve throughput, four PIPE LSICs calculate the horizontal response while the other four calculate the
vertical response. The magnitude postprocessing function is used. This corresponds to a 3 X 3
sliding paired-weights magnitude function. Only four PIPE LSICs are operating in parallel on one
direction and the throughput for the Sobel operation is 7.5 frames/second for 512 X 512 8-bit images.
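
A minimal Python sketch of the Sobel operator with a sum-of-absolute-values magnitude follows. The weighting arrays are the standard Sobel kernels; which one the brassboard labels as the horizontal response is not stated here, so the names `SOBEL_X` and `SOBEL_Y` are assumptions, and the 2-bit output shift is omitted.

```python
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]   # responds to left-to-right intensity changes
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]    # responds to top-to-bottom intensity changes

def window_response(image, weights, row, col):
    """Sum of products over the 3 X 3 neighborhood centered on (row, col)."""
    return sum(weights[dr + 1][dc + 1] * image[row + dr][col + dc]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1))

def sobel_magnitude(image, row, col):
    """Edge magnitude as the sum of absolute values of the two responses."""
    gx = window_response(image, SOBEL_X, row, col)
    gy = window_response(image, SOBEL_Y, row, col)
    return abs(gx) + abs(gy)
```

On the brassboard the two responses are computed at the same time by the two groups of four PIPE LSICs, and the magnitude postprocessing function combines them.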

In template-matching edge detection, a set of weighting arrays corresponding to the eight major
compass directions (north, northeast, east, etc.) is convolved with the input image. The PIPE brassboard
easily implements template-matching edge detection by using each PIPE LSIC to calculate the response
for one compass direction. For eight compass directions, the eight PIPE LSICs cannot operate in parallel
on one direction; therefore, the throughput for template-match edge detection is 1.875 frames/s. The
maximum-of-eight postprocessing function is utilized. Figure 63 shows the PIPE brassboard
implementing the fifth-level template-match edge detector. The south and west weighting arrays of the fifth-level
template-matching edge detector are identical to the weighting arrays used to calculate the Sobel edges.
The addition of extra weighting arrays makes the response of the template-match edge detector more
exact than the response of the sum of absolute values used in the Sobel edge detector. The edge responses
shown in Figures 62 and 63 are similar, but the template-match response is stronger than the Sobel
response.